Public bug reported:

== Comment: #0 - Praveen K. Pandey <[email protected]> - 2016-07-17 02:37:31 ==
Hi,

On Ubuntu 16.10 I tried fadump on a Brazos system (32 TB memory and 192
cores). When I triggered a panic, a kernel panic occurred during the dump
and the console hung.

Steps to reproduce:

1- Install Ubuntu 16.10
2- Boot the system with 31TB and 192 cores
3- Configure fadump on the system
4- Verify that fadump is running on the system
5- Trigger a panic on the system
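
Steps 4 and 5 can be sketched roughly as follows. This is only an illustration, assuming the kernel exposes the fadump flags as /sys/kernel/fadump_enabled and /sys/kernel/fadump_registered (the layout documented for powerpc kernels of this era; newer kernels may use different paths). The helper names fadump_state and trigger_crash are hypothetical and not part of any tool mentioned in this report.

```python
from pathlib import Path


def fadump_state(sysfs_root="/sys/kernel"):
    """Return (enabled, registered) as booleans read from the fadump sysfs flags.

    sysfs_root is a parameter only so the function can be pointed at a
    test directory; on a real system the default is assumed to apply.
    """
    def read_flag(name):
        try:
            return (Path(sysfs_root) / name).read_text().strip() == "1"
        except OSError:
            # A missing or unreadable file is treated as "flag not set".
            return False

    return read_flag("fadump_enabled"), read_flag("fadump_registered")


def trigger_crash(sysrq_path="/proc/sysrq-trigger"):
    """Write 'c' to the SysRq trigger. This crashes the machine; root only."""
    with open(sysrq_path, "w") as trigger:
        trigger.write("c")
```

A cautious caller would check that fadump_state() returns (True, True) before calling trigger_crash(), since the crash is immediate and unrecoverable.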

Actual Result

Unable to capture the fadump; a kernel panic occurs and the console hangs

Expected Result

The fadump is captured

Log:

root@ltc-brazos1:~# kdump-config show
DUMP_MODE:        fadump
USE_KDUMP:        1
KDUMP_SYSCTL:     kernel.panic_on_oops=1
KDUMP_COREDIR:    /var/crash
   /var/lib/kdump/vmlinuz: symbolic link to /boot/vmlinux-4.4.0-30-generic
kdump initrd: 
   /var/lib/kdump/initrd.img: symbolic link to /var/lib/kdump/initrd.img-4.4.0-30-generic
current state:    ready to fadump
root@ltc-brazos1:~# 

root@ltc-brazos1:~# cat /proc/cmdline 
BOOT_IMAGE=/boot/vmlinux-4.4.0-30-generic root=UUID=516c4b1b-6700-4b55-bd37-d61c4c5af6af ro quiet splash fadump=on fadump_reserve_mem=4096M crashkernel=4096M
root@ltc-brazos1:~# 
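
To read the dump-related settings out of a command line like the one above, a small parser along these lines may help. It is a minimal sketch: parse_cmdline and size_to_bytes are hypothetical helper names, and the parsing handles only plain key=value tokens with K/M/G/T size suffixes, not the full range syntax that crashkernel= accepts.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a dict; bare flags map to True."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else True
    return params


def size_to_bytes(text):
    """Convert a simple size such as '4096M' to bytes (binary units)."""
    units = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}
    suffix = text[-1].upper()
    if suffix in units:
        return int(text[:-1]) * units[suffix]
    return int(text)


# Command line copied from the log above.
CMDLINE = ("BOOT_IMAGE=/boot/vmlinux-4.4.0-30-generic "
           "root=UUID=516c4b1b-6700-4b55-bd37-d61c4c5af6af ro quiet splash "
           "fadump=on fadump_reserve_mem=4096M crashkernel=4096M")

params = parse_cmdline(CMDLINE)
reserve_bytes = size_to_bytes(params["fadump_reserve_mem"])
print(params["fadump"], reserve_bytes // (1 << 30), "GiB reserved for fadump")
```

On a live system the same functions could be applied to the contents of /proc/cmdline.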

ltc-brazos1 login: [  442.749993] sysrq: SysRq : Trigger a crash
[  442.750031] Unable to handle kernel paging request for data at address 0x00000000
[  442.750037] Faulting instruction address: 0xc000000000670014
[  442.750043] Oops: Kernel access of bad area, sig: 11 [#1]
[  442.750047] SMP NR_CPUS=2048 NUMA pSeries
[  442.750053] Modules linked in: pseries_rng btrfs xor raid6_pq rtc_generic sunrpc autofs4 ses enclosure ipr
[  442.750068] CPU: 157 PID: 403890 Comm: bash Not tainted 4.4.0-30-generic #49-Ubuntu
[  442.750074] task: c00003f97b0af640 ti: c00003f97b104000 task.ti: c00003f97b104000
[  442.750079] NIP: c000000000670014 LR: c0000000006710c8 CTR: c00000000066ffe0
[  442.750083] REGS: c00003f97b107990 TRAP: 0300   Not tainted  (4.4.0-30-generic)
[  442.750088] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28242222  XER: 00000001
[  442.750100] CFAR: c000000000008468 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 1
GPR00: c0000000006710c8 c00003f97b107c10 c0000000015b5d00 0000000000000063
GPR04: c00001faba749c50 c00001faba75b4e0 c0001f3efe7c0000 0000000000000313 
GPR08: 0000000000000007 0000000000000001 0000000000000000 c0001f3efe7cecb8 
GPR12: c00000000066ffe0 c00000000bc9d380 ffffffffffffffff 0000000022000000 
GPR16: 0000000010170dc8 000001001ef401d8 0000000010140f58 00000000100c7570 
GPR20: 0000000000000000 000000001017dd58 0000000010153618 000000001017b608 
GPR24: 00003ffff7c9e7b4 0000000000000001 c0000000014f8e58 0000000000000004 
GPR28: c0000000014f9218 0000000000000063 c0000000014b11dc 0000000000000000 
[  442.750165] NIP [c000000000670014] sysrq_handle_crash+0x34/0x50
[  442.750170] LR [c0000000006710c8] __handle_sysrq+0xe8/0x270
[  442.750174] Call Trace:
[  442.750179] [c00003f97b107c10] [c000000000e08f28] _fw_tigon_tg3_bin_name+0x2ce58/0x342b0 (unreliable)
[  442.750186] [c00003f97b107c30] [c0000000006710c8] __handle_sysrq+0xe8/0x270
[  442.750192] [c00003f97b107cd0] [c000000000671868] write_sysrq_trigger+0x78/0xa0
[  442.750199] [c00003f97b107d00] [c00000000037ae30] proc_reg_write+0xb0/0x110
[  442.750205] [c00003f97b107d50] [c0000000002e186c] __vfs_write+0x6c/0xe0
[  442.750210] [c00003f97b107d90] [c0000000002e25a0] vfs_write+0xc0/0x230
[  442.750216] [c00003f97b107de0] [c0000000002e35dc] SyS_write+0x6c/0x110
[  442.750222] [c00003f97b107e30] [c000000000009204] system_call+0x38/0xb4
[  442.750226] Instruction dump:
[  442.750229] 38425d20 7c0802a6 f8010010 f821ffe1 60000000 60000000 3d220019 394931e4
[  442.750238] 39200001 912a0000 7c0004ac 39400000 <992a0000> 38210020 e8010010 7c0803a6
[  442.750248] ---[ end trace ff61e1bc4dd59a42 ]---
[  442.752585] 


Loading Linux 4.4.0-30-generic ...
Loading initial ramdisk ...
OF stdout device is: /vdevice/vty@30000000
Preparing to boot Linux version 4.4.0-30-generic (buildd@bos01-ppc64el-023) (gcc version 5.3.1 20160413 (Ubuntu/IBM 5.3.1-14ubuntu2.1) ) #49-Ubuntu SMP Fri Jul 1 10:00:36 UTC 2016 (Ubuntu 4.4.0-30.49-generic 4.4.13)
Detected machine type: 0000000000000101
Max number of cores passed to firmware: 256 (NR_CPUS = 2048)
Calling ibm,client-architecture-support... done
command line: BOOT_IMAGE=/boot/vmlinux-4.4.0-30-generic root=UUID=516c4b1b-6700-4b55-bd37-d61c4c5af6af ro quiet splash fadump=on fadump_reserve_mem=4096M crashkernel=4096M
Ignoring mem=0000000100000000 >= ram_top.
memory layout at init:
  memory_limit : 0000000000000000 (16 MB aligned)
  alloc_bottom : 000000000e020000
  alloc_top    : 0000000010000000
  alloc_top_hi : 0000000010000000
  rmo_top      : 0000000010000000
  ram_top      : 0000000010000000
instantiating rtas at 0x000000000e9e0000... done
prom_hold_cpus: skipped
copying OF device tree...
Building dt strings...
Building dt structure...
Device tree strings 0x000000000e030000 -> 0x000000000e0319a4
Device tree struct  0x000000000e040000 -> 0x000000000e640000
Quiescing Open Firmware ...
Booting Linux via __start() ...
 -> smp_release_cpus()
spinning_secondaries = 1535
 <- smp_release_cpus()
 <- setup_system()
[    0.000000] Kernel panic - not syncing: memblock_virt_alloc_try_nid: Failed to allocate 16777216 bytes align=0x1000000 nid=1 from=0xfffffffffffffff max_addr=0x0
[    0.000000] 
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.4.0-30-generic #49-Ubuntu
[    0.000000] Call Trace:
[    0.000000] [c0000000015b39d0] [c000000000af955c] dump_stack+0xb0/0xf0 (unreliable)
[    0.000000] [c0000000015b3a10] [c000000000af5790] panic+0x100/0x2c0
[    0.000000] [c0000000015b3aa0] [c000000000ed238c] memblock_virt_alloc_try_nid+0xc0/0xe8
[    0.000000] [c0000000015b3b30] [c0000000002db69c] __earlyonly_bootmem_alloc.constprop.2+0x50/0x74
[    0.000000] [c0000000015b3b70] [c000000000afc5fc] vmemmap_populate+0xf8/0x250
[    0.000000] [c0000000015b3c40] [c000000000afdfa8] sparse_mem_map_populate+0x38/0x64
[    0.000000] [c0000000015b3c70] [c000000000ed4234] sparse_init+0x1d4/0x298
[    0.000000] [c0000000015b3d30] [c000000000eb3604] initmem_init+0xabc/0xd68
[    0.000000] [c0000000015b3e50] [c000000000eab418] setup_arch+0x270/0x300
[    0.000000] [c0000000015b3f00] [c000000000ea3ae4] start_kernel+0xc4/0x558
[    0.000000] [c0000000015b3f90] [c000000000008c6c] start_here_common+0x20/0xa8
[    0.000000] ---[ end Kernel panic - not syncing: memblock_virt_alloc_try_nid: Failed to allocate 16777216 bytes align=0x1000000 nid=1 from=0xfffffffffffffff max_addr=0x0
[    0.000000] 
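
For readers decoding the panic line above, the numbers work out to a single 16 MiB request with 16 MiB alignment on NUMA node 1. The snippet below is only a sketch of that arithmetic; decode_alloc_failure is a hypothetical helper, and the regular expression is an assumption fitted to this one message, not a general parser for memblock errors.

```python
import re

# Panic text copied from the log above.
PANIC = ("memblock_virt_alloc_try_nid: Failed to allocate 16777216 bytes "
         "align=0x1000000 nid=1 from=0xfffffffffffffff max_addr=0x0")


def decode_alloc_failure(message):
    """Extract size, alignment (both in MiB) and node id from the message."""
    match = re.search(
        r"Failed to allocate (\d+) bytes align=(0x[0-9a-fA-F]+) nid=(-?\d+)",
        message,
    )
    if match is None:
        raise ValueError("not a memblock allocation failure message")
    size = int(match.group(1))          # decimal byte count
    align = int(match.group(2), 16)     # hexadecimal alignment
    return {
        "size_mib": size // (1 << 20),
        "align_mib": align // (1 << 20),
        "nid": int(match.group(3)),
    }


print(decode_alloc_failure(PANIC))
```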

Regards
Praveen

== Comment: #1 - Praveen K. Pandey <[email protected]> - 2016-07-17 02:40:23 ==


== Comment: #14 - SRIKAR DRONAMRAJU <[email protected]> - 2016-08-31 11:02:28 ==
V3 was posted upstream at 
http://lkml.kernel.org/r/[email protected].

That should at least solve the problem (at least it wouldn't panic/hang on
triggering fadump).

The patches posted were on top of 4.8-rc3 and apply cleanly on v4.4.
I am not sure which kernel is targeted for 16.10; I hear it's going to be
based on v4.8.
Once we know which kernel version Ubuntu is targeting, we can backport the
patch set accordingly.

== Comment: #18 - Gary M. Gaydos <[email protected]> - 2016-09-14 16:56:11 ==
Hi Canonical: per the comment below with the patch set link, this bug appears
to be fixed using the 4.4.0-34 kernel. Of course, the 16.10 release will use a
newer kernel.

V3 was posted upstream at http://lkml.kernel.org/r/1472476010-4709-1
[email protected].

That should at least solve the problem (at least it wouldn't panic/hang on
triggering fadump).

The patches posted were on top of 4.8-rc3 and apply cleanly on v4.4.
I am not sure which kernel is targeted for 16.10; I hear it's going to be
based on v4.8.
Once we know which kernel version Ubuntu is targeting, we can backport the
patch set accordingly.


Exposing a comment from test that was previously private:
(In reply to comment #16)
> Hi Praveen, 
> 
> I have applied the patches to the Yakkety kernel source and built the *.deb
> files. I have kept them on powerdev.in.ibm.com. Have sent you the access
> details over email

Hi Latha,

Thanks. I tried the patched kernel and the issue seems to be fixed; I am
able to capture the fadump.

Log:

root@ltc-brazos1:~# cat /proc/cmdline 
BOOT_IMAGE=/boot/vmlinux-4.4.0-34-generic root=UUID=bfdd4041-1b2f-42b1-b202-2c09f781bbcc ro fadump=on quiet splash fadump=on crashkernel=384M-:128M
root@ltc-brazos1:~# 

root@ltc-brazos1:/var/crash# ls
201609140950  kexec_cmd  linux-image-4.4.0-34-generic-201609140950.crash
root@ltc-brazos1:/var/crash# cd 201609140950
root@ltc-brazos1:/var/crash/201609140950# ls
dmesg.201609140950  dump.201609140950
root@ltc-brazos1:/var/crash/201609140950# 

Regards
Praveen

== Comment: #20 - Hari Krishna Bathini <[email protected]> - 2016-09-23 03:49:36 ==
Mirroring the bug so Canonical can pick up the fix patches.
Srikar, can you please provide the upstream commit IDs of the fix patches?

Thanks
Hari

== Comment: #21 - Hari Krishna Bathini <[email protected]> - 2016-09-23 03:59:17 ==
(In reply to comment #14)
> V3 was posted upstream at
> http://lkml.kernel.org/r/[email protected].
> ibm.com.
> 
> That should at least solve the problem (at least it wouldn't panic/hang on
> triggering fadump).
> 
> The patches posted were on top of 4.8-rc3 and apply cleanly on v4.4.
> I am not sure which kernel is targeted for 16.10; I hear it's going to be
> based on v4.8.

Yeah. 16.10 -proposed now has a v4.8-based kernel.

Thanks
Hari

** Affects: linux (Ubuntu)
     Importance: Undecided
     Assignee: Taco Screen team (taco-screen-team)
         Status: New


** Tags: architecture-ppc64le bugnameltc-143827 severity-critical 
targetmilestone-inin1610

** Tags added: architecture-ppc64le bugnameltc-143827 severity-critical
targetmilestone-inin1610

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1627036

Title:
  In Ubuntu16.10:Fadump fails as Kernel panic reported while
  dumping-,console got hung on 32TB Brazos System (kdump)
