Public bug reported:
== Comment: #0 - Praveen K. Pandey <[email protected]> - 2016-07-17
02:37:31 ==
Hi,
On Ubuntu 16.10 I tried fadump on a Brazos system (32TB memory and 192
cores). When a panic is triggered, the kernel panics and the console hangs.
Steps to reproduce:
1- Install Ubuntu 16.10
2- Boot the system with 31TB and 192 cores
3- Configure fadump on the system
4- Verify that fadump is running on the system
5- Trigger a panic on the system
Actual Result
Unable to capture the fadump; the kernel panics and the console hangs
Expected Result
The fadump is captured
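Steps 3 and 4 above could also be checked from a script. The following is a minimal sketch, not part of the original report, assuming the powerpc fadump sysfs flags /sys/kernel/fadump_enabled and /sys/kernel/fadump_registered are present on the system. The names fadump_state, sysfs_root and cmdline_path are illustrative only.

```python
from pathlib import Path


def read_flag(path):
    """Return the integer stored in a sysfs flag file, or None if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def fadump_state(sysfs_root="/sys/kernel", cmdline_path="/proc/cmdline"):
    """Summarise whether fadump was requested at boot and registered.

    sysfs_root and cmdline_path are parameters (an assumption of this
    sketch) so the check can be pointed at copies of the files.
    """
    try:
        cmdline = Path(cmdline_path).read_text().split()
    except OSError:
        cmdline = []
    return {
        "requested": "fadump=on" in cmdline,
        "enabled": read_flag(Path(sysfs_root) / "fadump_enabled"),
        "registered": read_flag(Path(sysfs_root) / "fadump_registered"),
    }
```

Calling fadump_state() with no arguments reads the live system. A value of None for "enabled" or "registered" means the corresponding file could not be read, which would be expected on a kernel without fadump support.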
Log:
root@ltc-brazos1:~# kdump-config show
DUMP_MODE: fadump
USE_KDUMP: 1
KDUMP_SYSCTL: kernel.panic_on_oops=1
KDUMP_COREDIR: /var/crash
/var/lib/kdump/vmlinuz: symbolic link to /boot/vmlinux-4.4.0-30-generic
kdump initrd:
/var/lib/kdump/initrd.img: symbolic link to
/var/lib/kdump/initrd.img-4.4.0-30-generic
current state: ready to fadump
root@ltc-brazos1:~#
root@ltc-brazos1:~# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinux-4.4.0-30-generic
root=UUID=516c4b1b-6700-4b55-bd37-d61c4c5af6af ro quiet splash fadump=on
fadump_reserve_mem=4096M crashkernel=4096M
root@ltc-brazos1:~#
ltc-brazos1 login: [ 442.749993] sysrq: SysRq : Trigger a crash
[ 442.750031] Unable to handle kernel paging request for data at address
0x00000000
[ 442.750037] Faulting instruction address: 0xc000000000670014
[ 442.750043] Oops: Kernel access of bad area, sig: 11 [#1]
[ 442.750047] SMP NR_CPUS=2048 NUMA pSeries
[ 442.750053] Modules linked in: pseries_rng btrfs xor raid6_pq rtc_generic
sunrpc autofs4 ses enclosure ipr
[ 442.750068] CPU: 157 PID: 403890 Comm: bash Not tainted 4.4.0-30-generic
#49-Ubuntu
[ 442.750074] task: c00003f97b0af640 ti: c00003f97b104000 task.ti:
c00003f97b104000
[ 442.750079] NIP: c000000000670014 LR: c0000000006710c8 CTR: c00000000066ffe0
[ 442.750083] REGS: c00003f97b107990 TRAP: 0300 Not tainted
(4.4.0-30-generic)
[ 442.750088] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28242222 XER:
00000001
[ 442.750100] CFAR: c000000000008468 DAR: 0000000000000000 DSISR: 42000000
SOFTE: 1
GPR00: c0000000006710c8 c00003f97b107c10 c0000000015b5d00 0000000000000063
GPR04: c00001faba749c50 c00001faba75b4e0 c0001f3efe7c0000 0000000000000313
GPR08: 0000000000000007 0000000000000001 0000000000000000 c0001f3efe7cecb8
GPR12: c00000000066ffe0 c00000000bc9d380 ffffffffffffffff 0000000022000000
GPR16: 0000000010170dc8 000001001ef401d8 0000000010140f58 00000000100c7570
GPR20: 0000000000000000 000000001017dd58 0000000010153618 000000001017b608
GPR24: 00003ffff7c9e7b4 0000000000000001 c0000000014f8e58 0000000000000004
GPR28: c0000000014f9218 0000000000000063 c0000000014b11dc 0000000000000000
[ 442.750165] NIP [c000000000670014] sysrq_handle_crash+0x34/0x50
[ 442.750170] LR [c0000000006710c8] __handle_sysrq+0xe8/0x270
[ 442.750174] Call Trace:
[ 442.750179] [c00003f97b107c10] [c000000000e08f28]
_fw_tigon_tg3_bin_name+0x2ce58/0x342b0 (unreliable)
[ 442.750186] [c00003f97b107c30] [c0000000006710c8] __handle_sysrq+0xe8/0x270
[ 442.750192] [c00003f97b107cd0] [c000000000671868]
write_sysrq_trigger+0x78/0xa0
[ 442.750199] [c00003f97b107d00] [c00000000037ae30] proc_reg_write+0xb0/0x110
[ 442.750205] [c00003f97b107d50] [c0000000002e186c] __vfs_write+0x6c/0xe0
[ 442.750210] [c00003f97b107d90] [c0000000002e25a0] vfs_write+0xc0/0x230
[ 442.750216] [c00003f97b107de0] [c0000000002e35dc] SyS_write+0x6c/0x110
[ 442.750222] [c00003f97b107e30] [c000000000009204] system_call+0x38/0xb4
[ 442.750226] Instruction dump:
[ 442.750229] 38425d20 7c0802a6 f8010010 f821ffe1 60000000 60000000 3d220019
394931e4
[ 442.750238] 39200001 912a0000 7c0004ac 39400000 <992a0000> 38210020 e8010010
7c0803a6
[ 442.750248] ---[ end trace ff61e1bc4dd59a42 ]---
[ 442.752585]
Loading Linux 4.4.0-30-generic ...
Loading initial ramdisk ...
OF stdout device is: /vdevice/vty@30000000
Preparing to boot Linux version 4.4.0-30-generic (buildd@bos01-ppc64el-023)
(gcc version 5.3.1 20160413 (Ubuntu/IBM 5.3.1-14ubuntu2.1) ) #49-Ubuntu SMP Fri
Jul 1 10:00:36 UTC 2016 (Ubuntu 4.4.0-30.49-generic 4.4.13)
Detected machine type: 0000000000000101
Max number of cores passed to firmware: 256 (NR_CPUS = 2048)
Calling ibm,client-architecture-support... done
command line: BOOT_IMAGE=/boot/vmlinux-4.4.0-30-generic
root=UUID=516c4b1b-6700-4b55-bd37-d61c4c5af6af ro quiet splash fadump=on
fadump_reserve_mem=4096M crashkernel=4096M
Ignoring mem=0000000100000000 >= ram_top.
memory layout at init:
memory_limit : 0000000000000000 (16 MB aligned)
alloc_bottom : 000000000e020000
alloc_top : 0000000010000000
alloc_top_hi : 0000000010000000
rmo_top : 0000000010000000
ram_top : 0000000010000000
instantiating rtas at 0x000000000e9e0000... done
prom_hold_cpus: skipped
copying OF device tree...
Building dt strings...
Building dt structure...
Device tree strings 0x000000000e030000 -> 0x000000000e0319a4
Device tree struct 0x000000000e040000 -> 0x000000000e640000
Quiescing Open Firmware ...
Booting Linux via __start() ...
-> smp_release_cpus()
spinning_secondaries = 1535
<- smp_release_cpus()
<- setup_system()
[ 0.000000] Kernel panic - not syncing: memblock_virt_alloc_try_nid: Failed
to allocate 16777216 bytes align=0x1000000 nid=1 from=0xfffffffffffffff
max_addr=0x0
[ 0.000000]
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.4.0-30-generic
#49-Ubuntu
[ 0.000000] Call Trace:
[ 0.000000] [c0000000015b39d0] [c000000000af955c] dump_stack+0xb0/0xf0
(unreliable)
[ 0.000000] [c0000000015b3a10] [c000000000af5790] panic+0x100/0x2c0
[ 0.000000] [c0000000015b3aa0] [c000000000ed238c]
memblock_virt_alloc_try_nid+0xc0/0xe8
[ 0.000000] [c0000000015b3b30] [c0000000002db69c]
__earlyonly_bootmem_alloc.constprop.2+0x50/0x74
[ 0.000000] [c0000000015b3b70] [c000000000afc5fc] vmemmap_populate+0xf8/0x250
[ 0.000000] [c0000000015b3c40] [c000000000afdfa8]
sparse_mem_map_populate+0x38/0x64
[ 0.000000] [c0000000015b3c70] [c000000000ed4234] sparse_init+0x1d4/0x298
[ 0.000000] [c0000000015b3d30] [c000000000eb3604] initmem_init+0xabc/0xd68
[ 0.000000] [c0000000015b3e50] [c000000000eab418] setup_arch+0x270/0x300
[ 0.000000] [c0000000015b3f00] [c000000000ea3ae4] start_kernel+0xc4/0x558
[ 0.000000] [c0000000015b3f90] [c000000000008c6c] start_here_common+0x20/0xa8
[ 0.000000] ---[ end Kernel panic - not syncing:
memblock_virt_alloc_try_nid: Failed to allocate 16777216 bytes align=0x1000000
nid=1 from=0xfffffffffffffff max_addr=0x0
[ 0.000000]
Regards
Praveen
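For readers of the log above, the sizes in the second-boot panic can be put into more familiar units. This is a small sketch, not part of the original report, assuming the numbers are byte counts and addresses exactly as printed. The helper names decode_alloc_failure and hex_to_mib are illustrative. It only converts units and does not diagnose the failure.

```python
import re

# Matches the memblock failure text even when the console wrapped the line.
PANIC_RE = re.compile(
    r"Failed\s+to\s+allocate\s+(?P<size>\d+)\s+bytes\s+"
    r"align=(?P<align>0x[0-9a-fA-F]+)\s+nid=(?P<nid>-?\d+)"
)

MIB = 1024 * 1024


def decode_alloc_failure(message):
    """Extract request size, alignment (both in MiB) and NUMA node id."""
    m = PANIC_RE.search(message)
    if m is None:
        return None
    return {
        "size_mib": int(m.group("size")) // MIB,
        "align_mib": int(m.group("align"), 16) // MIB,
        "nid": int(m.group("nid")),
    }


def hex_to_mib(value):
    """Convert a hex string such as the ram_top value into MiB."""
    return int(value, 16) // MIB
```

Applied to the values in the log, the failed request is 16 MiB with 16 MiB alignment on node 1, ram_top (0000000010000000) is 256 MiB, and the ignored mem= value (0000000100000000) is 4096 MiB.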
== Comment: #1 - Praveen K. Pandey <[email protected]> -
2016-07-17 02:40:23 ==
== Comment: #14 - SRIKAR DRONAMRAJU <[email protected]> - 2016-08-31
11:02:28 ==
V3 was posted upstream at
http://lkml.kernel.org/r/[email protected].
That should at least solve the problem (at least it wouldn't panic/hang on
triggering fadump).
The patches posted were on top of 4.8-rc3 and apply cleanly on v4.4.
I am not sure which kernel is targeted for 16.10. I hear it's going to be
based on v4.8.
Once we know which kernel version Ubuntu is targeting, we can backport the
patchset accordingly.
== Comment: #18 - Gary M. Gaydos <[email protected]> - 2016-09-14 16:56:11 ==
Hi Canonical: Per this comment with the patch set link, this bug appears to be
fixed using the 4.4.0-34 kernel. Of course the 16.10 release will use a newer
kernel.
V3 was posted upstream at http://lkml.kernel.org/r/1472476010-4709-1
[email protected].
That should at least solve the problem (at least it wouldn't panic/hang on
triggering fadump).
The patches posted were on top of 4.8-rc3 and apply cleanly on v4.4.
I am not sure which kernel is targeted for 16.10. I hear it's going to be
based on v4.8.
Once we know which kernel version Ubuntu is targeting, we can backport the
patchset accordingly.
Exposing a comment from test that was previously private:
(In reply to comment #16)
> Hi Praveen,
>
> I have applied the patches to the Yakkety kernel source and built the *.deb
> files. I have kept them on powerdev.in.ibm.com. Have sent you the access
> details over email
Hi Latha,
Thanks. I tried the patched kernel and the issue seems to be fixed. I am
able to capture the fadump.
Log:
root@ltc-brazos1:~# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinux-4.4.0-34-generic
root=UUID=bfdd4041-1b2f-42b1-b202-2c09f781bbcc ro fadump=on quiet splash
fadump=on crashkernel=384M-:128M
root@ltc-brazos1:~#
root@ltc-brazos1:/var/crash# ls
201609140950 kexec_cmd linux-image-4.4.0-34-generic-201609140950.crash
root@ltc-brazos1:/var/crash# cd 201609140950
root@ltc-brazos1:/var/crash/201609140950# ls
dmesg.201609140950 dump.201609140950
root@ltc-brazos1:/var/crash/201609140950#
Regards
Praveen
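The manual check of /var/crash shown in the log above could be scripted roughly as follows. This is a sketch, not part of the original comment, assuming the layout seen in the listing: one timestamped subdirectory per crash, containing a dump.<timestamp> file. The function name find_dumps is illustrative.

```python
from pathlib import Path


def find_dumps(coredir="/var/crash"):
    """Map each crash subdirectory name to the sorted dump.* files inside it.

    Subdirectories without any dump.* file are omitted from the result.
    """
    result = {}
    root = Path(coredir)
    if not root.is_dir():
        return result
    for sub in sorted(p for p in root.iterdir() if p.is_dir()):
        dumps = sorted(f.name for f in sub.glob("dump.*") if f.is_file())
        if dumps:
            result[sub.name] = dumps
    return result
```

An empty result after a triggered crash would suggest that no dump was captured under the given directory.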
== Comment: #20 - Hari Krishna Bathini <[email protected]> - 2016-09-23
03:49:36 ==
Mirroring the bug so Canonical can pick up the fix patches.
Srikar, can you please provide the upstream commit IDs of the fix patches?
Thanks
Hari
== Comment: #21 - Hari Krishna Bathini <[email protected]> - 2016-09-23
03:59:17 ==
(In reply to comment #14)
> V3 was posted upstream at
> http://lkml.kernel.org/r/[email protected].
> ibm.com.
>
> That should at least solve the problem (at least it wouldn't panic/hang on
> triggering fadump).
>
> The patches posted were on top of 4.8-rc3 and apply cleanly on v4.4.
> I am not sure which kernel is targeted for 16.10. I hear it's going to be
> based on v4.8.
Yeah. 16.10-proposed now has a v4.8-based kernel.
Thanks
Hari
** Affects: linux (Ubuntu)
Importance: Undecided
Assignee: Taco Screen team (taco-screen-team)
Status: New
** Tags: architecture-ppc64le bugnameltc-143827 severity-critical
targetmilestone-inin1610
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1627036
Title:
In Ubuntu16.10:Fadump fails as Kernel panic reported while
dumping-,console got hung on 32TB Brazos System (kdump)