[Bug 1758206] Comment bridged from LTC Bugzilla

2018-04-05 Thread bugproxy
--- Comment From pavra...@in.ibm.com 2018-04-05 07:34 EDT---
Issue is resolved in 4.15.0-15-generic kernel.
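
To check whether a host already runs the kernel named above, something like
the following could be used. This is only a sketch: kernel_at_least is a
hypothetical helper, it relies on GNU "sort -V" for version ordering, and it
assumes that 4.15.0-N builds later than 4.15.0-15 keep the fix.

```shell
# Hypothetical helper: succeed if version $1 (e.g. the output of `uname -r`)
# is at least the minimum version $2.
kernel_at_least() {
    # Strip the flavour suffix, e.g. "4.15.0-15-generic" -> "4.15.0-15".
    running=$(printf '%s\n' "$1" | sed -E 's/^([0-9]+(\.[0-9]+)*-[0-9]+).*/\1/')
    # The lower of the two versions sorts first under version ordering.
    lowest=$(printf '%s\n%s\n' "$running" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}
```

Usage: kernel_at_least "$(uname -r)" 4.15.0-15 && echo "kernel has the fix"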

root@ltc-wspoon4:~# ppc64_cpu --smt
SMT is off

Starting Kernel crash dump capture service...
[   11.747657] kdump-tools[952]: Starting kdump-tools:  * running makedumpfile -c -d 31 /proc/vmcore /var/crash/201804050626/dump-incomplete
Copying data  : [100.0 %] \   eta: 0s
[   27.390223] kdump-tools[952]: The kernel version is not supported.
[   27.390438] kdump-tools[952]: The makedumpfile operation may be incomplete.
[   27.390563] kdump-tools[952]: The dumpfile is saved to /var/crash/201804050626/dump-incomplete.
[   27.390726] kdump-tools[952]: makedumpfile Completed.
[   27.405543] kdump-tools[952]:  * kdump-tools: saved vmcore in /var/crash/201804050626
[   30.762418] kdump-tools[952]:  * running makedumpfile --dump-dmesg /proc/vmcore /var/crash/201804050626/dmesg.201804050626
[   30.802776] kdump-tools[952]: The kernel version is not supported.
[   30.802923] kdump-tools[952]: The makedumpfile operation may be incomplete.
[   30.803025] kdump-tools[952]: The dmesg log is saved to /var/crash/201804050626/dmesg.201804050626.
[   30.803145] kdump-tools[952]: makedumpfile Completed.
[   30.803263] kdump-tools[952]:  * kdump-tools: saved dmesg content in /var/crash/201804050626
[   30.888353] kdump-tools[952]: Thu, 05 Apr 2018 06:26:24 -0500
[   31.035631] kdump-tools[952]: Rebooting.
[   31.126613] reboot: Restarting system
[ 1577.265030518,5] OPAL: Reboot request...

root@ltc-wspoon4:~# ppc64_cpu --smt
SMT=2

Starting Kernel crash dump capture service...
[   13.378626] kdump-tools[952]: Starting kdump-tools:  * running makedumpfile -c -d 31 /proc/vmcore /var/crash/201804050631/dump-incomplete
Copying data  : [100.0 %] |   eta: 0s
[   27.102530] kdump-tools[952]: The kernel version is not supported.
[   27.102659] kdump-tools[952]: The makedumpfile operation may be incomplete.
[   27.102787] kdump-tools[952]: The dumpfile is saved to /var/crash/201804050631/dump-incomplete.
[   27.102910] kdump-tools[952]: makedumpfile Completed.
[   27.112064] kdump-tools[952]:  * kdump-tools: saved vmcore in /var/crash/201804050631
[   29.632162] kdump-tools[952]:  * running makedumpfile --dump-dmesg /proc/vmcore /var/crash/201804050631/dmesg.201804050631
[   29.672730] kdump-tools[952]: The kernel version is not supported.
[   29.672890] kdump-tools[952]: The makedumpfile operation may be incomplete.
[   29.672997] kdump-tools[952]: The dmesg log is saved to /var/crash/201804050631/dmesg.201804050631.
[   29.673111] kdump-tools[952]: makedumpfile Completed.
[   29.673249] kdump-tools[952]:  * kdump-tools: saved dmesg content in /var/crash/201804050631
[   29.774672] kdump-tools[952]: Thu, 05 Apr 2018 06:31:40 -0500
[   29.913780] kdump-tools[952]: Rebooting.
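
After the reboot, the capture reported in the logs above could be
double-checked on disk with a sketch like this one. check_crash_dir is a
hypothetical helper. The file names follow the log (dump-incomplete and
dmesg.<stamp>); the "dump*" pattern is an assumption meant to also match a
dump file that has been renamed after completion.

```shell
# Hypothetical helper: succeed if crash directory $1 holds a non-empty
# dump file and a non-empty dmesg file.
check_crash_dir() {
    dir=$1
    status=0
    for pattern in 'dump*' 'dmesg.*'; do
        # -size +0 skips empty files; grep -q . succeeds if anything matched.
        if find "$dir" -maxdepth 1 -type f -name "$pattern" -size +0 | grep -q .; then
            echo "found:   $pattern"
        else
            echo "missing: $pattern"
            status=1
        fi
    done
    return $status
}
```

Usage: check_crash_dir /var/crash/201804050631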

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1758206

Title:
  Ubuntu 18.04 [ WSP DD2.2 with stop4 and stop5 enabled ]: kdump fails
  to capture dump when smt=2 or off.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1758206/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1758206] Comment bridged from LTC Bugzilla

2018-03-29 Thread bugproxy
--- Comment From pavra...@in.ibm.com 2018-03-30 01:14 EDT---
Tested again with the given kernel; dump capture is successful with both
smt=2 and smt=off.

Sorry for the wrong update in the previous comment; I am not sure what I
missed yesterday.
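
For anyone repeating the test, the steps used in this thread can be wrapped
roughly as follows. This is a sketch, not the exact procedure: run and
crash_test are hypothetical helpers, and the RUN variable is an assumption
added here so that nothing is executed by default, since the last step
deliberately crashes the running kernel.

```shell
# Print each command; execute it only when RUN=1 is set in the environment.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        sh -c "$1"
    else
        printf 'would run: %s\n' "$1"
    fi
}

# $1 is the SMT mode under test: "off" or "2", as in the bug title.
crash_test() {
    smt_mode=$1
    run "ppc64_cpu --smt=$smt_mode"        # set the SMT mode
    run "ppc64_cpu --smt"                  # confirm the mode took effect
    run "kdump-config show"                # state should be "ready to kdump"
    run "echo 1 > /proc/sys/kernel/sysrq"  # enable all SysRq functions
    run "echo c > /proc/sysrq-trigger"     # force a crash; kdump takes over
}
```

Usage: "crash_test off" prints the commands; as root with RUN=1 exported, the
same call runs them and crashes the machine.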

root@ltc-wspoon4:~# uname -a
Linux ltc-wspoon4 4.15.0-12-generic #13~lp1758206 SMP Tue Mar 27 15:20:59 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux
root@ltc-wspoon4:~# ppc64_cpu --smt=off
root@ltc-wspoon4:~#
root@ltc-wspoon4:~# echo 1 > /proc/sys/kernel/sysrq
root@ltc-wspoon4:~# echo "c" > /proc/sysrq-trigger
[ 1424.806117] sysrq: SysRq : Trigger a crash
[ 1424.806163] Unable to handle kernel paging request for data at address 0x
[ 1424.806267] Faulting instruction address: 0xc07ec768
[ 1424.806352] Oops: Kernel access of bad area, sig: 11 [#1]
[ 1424.806424] LE SMP NR_CPUS=2048 NUMA PowerNV
[ 1424.806483] Modules linked in: idt_89hpesx(E) at24 ofpart uio_pdrv_genirq cmdlinepart powernv_flash uio mtd opal_prd ipmi_powernv ipmi_devintf ibmpowernv vmx_crypto ipmi_msghandler crct10dif_vpmsum sch_fq_codel ip_tables x_tables autofs4 ast i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ahci crc32c_vpmsum drm tg3 libahci
[ 1424.806828] CPU: 0 PID: 3110 Comm: bash Tainted: GE 4.15.0-12-generic #13~lp1758206
[ 1424.806963] NIP:  c07ec768 LR: c07ed6a8 CTR: c07ec740
[ 1424.807075] REGS: c01fce3d39f0 TRAP: 0300   Tainted: GE (4.15.0-12-generic)
[ 1424.807211] MSR:  90009033   CR: 2822  XER: 2004
[ 1424.807325] CFAR: c07ed6a4 DAR:  DSISR: 4200 SOFTE: 1
[ 1424.807325] GPR00: c07ed6a8 c01fce3d3c70 c16eaf00 0063
[ 1424.807325] GPR04: c01ff6fbce18 c01ff6fd4368 90009033 000a
[ 1424.807325] GPR08: 0007 0001  90001003
[ 1424.807325] GPR12: c07ec740 c7a2 06127f00ae48
[ 1424.807325] GPR16: 06124f78e9f0 06124f821998 06124f8219d0 06124f858204
[ 1424.807325] GPR20:  0001  7fffd6e57524
[ 1424.807325] GPR24: 7fffd6e57520 06124f85afc4 c15e9968 0002
[ 1424.807325] GPR28: 0063 0004 c1572a9c c15e9d08
[ 1424.808272] NIP [c07ec768] sysrq_handle_crash+0x28/0x30
[ 1424.808364] LR [c07ed6a8] __handle_sysrq+0xf8/0x2c0
[ 1424.808417] Call Trace:
[ 1424.808468] [c01fce3d3c70] [c07ed688] __handle_sysrq+0xd8/0x2c0 (unreliable)
[ 1424.808582] [c01fce3d3d10] [c07edeb4] write_sysrq_trigger+0x64/0x90
[ 1424.808690] [c01fce3d3d40] [c047dfe8] proc_reg_write+0x88/0xd0
[ 1424.808782] [c01fce3d3d70] [c03d131c] __vfs_write+0x3c/0x70
[ 1424.808875] [c01fce3d3d90] [c03d1578] vfs_write+0xd8/0x220
[ 1424.808957] [c01fce3d3de0] [c03d1898] SyS_write+0x68/0x110
[ 1424.809038] [c01fce3d3e30] [c000b184] system_call+0x58/0x6c
[ 1424.809139] Instruction dump:
[ 1424.809191] 4bfff9f1 4bfffe50 3c4c00f0 3842e7c0 7c0802a6 6000 3921 3d42001c
[ 1424.809294] 394a6db0 912a 7c0004ac 3940 <992a> 4e800020 3c4c00f0 3842e790
[ 1424.809399] ---[ end trace a6b92894072107e0 ]---
[ 1425.814557]
[ 1425.814659] Sending IPI to other CPUs
[ 1427[ 1827.188061287,5] OPAL: Switch to big-endian OS
.111853] IPI complete
[ 1428[ 1830.496187306,5] OPAL: Switch to little-endian OS
[ 1832.313865861,3] PHB#[0:0]: CRESET: Unexpected slot state 0102, resetting...
[ 1840.498727171,3] PHB#0003[0:3]: CRESET: Unexpected slot state 0102, resetting...
[ 1849.245109062,3] PHB#0030[8:0]: CRESET: Unexpected slot state 0102, resetting...
[ 1851.209060452,3] PHB#0033[8:3]: CRESET: Unexpected slot state 0102, resetting...
[ 1853.170614858,3] PHB#0034[8:4]: CRESET: Unexpected slot state 0102, resetting...
.808156] kexec: Starting switchover sequence.
[1.199857] integrity: Unable to open file: /etc/keys/x509_ima.der (-2)
[1.199861] integrity: Unable to open file: /etc/keys/x509_evm.der (-2)
[1.286500] vio vio: uevent: failed to send synthetic uevent
/dev/sdb2: recovering journal
/dev/sdb2: clean, 163655/61054976 files, 17123931/244188416 blocks
[6.018312] vio vio: uevent: failed to send synthetic uevent
[  OK  ] Started Show Plymouth Boot Screen.
plymouth-start.service
[  OK  ] Started Forward Password Requests to Plymouth Directory Watch.
[  OK  ] Reached target Local Encrypted Volumes.
[  OK  ] Started Network Service.
systemd-networkd.service
Starting Wait for Network to be Configured...
[  OK  ] Reached target Network.
[7.934300] PKCS#7 signature not signed with a trusted key
[7.934373] PKCS#7 signature not signed with a trusted key
[7.935026] PKCS#7 signature not signed with a trusted key
[7.935470] PKCS#7 signature not signed 

[Bug 1758206] Comment bridged from LTC Bugzilla

2018-03-29 Thread bugproxy
--- Comment From pavra...@in.ibm.com 2018-03-29 11:31 EDT---
(In reply to comment #10)
> I built a Bionic test kernel with the three commits mentioned in the bug
> description.  The test kernel can be downloaded from:
> http://kernel.ubuntu.com/~jsalisbury/lp1758206
>
> Can you test this kernel and see if it resolves this bug?
>
> Note, to test this kernel, you need to install both the linux-image and
> linux-image-extra .deb packages.
>
> Thanks in advance!

Tried with the given kernel; kexec still failed. Please find the logs below.

root@ltc-wspoon4:~# ppc64_cpu --smt
SMT is off
root@ltc-wspoon4:~# kdump-config show
DUMP_MODE:kdump
USE_KDUMP:1
KDUMP_SYSCTL: kernel.panic_on_oops=1
KDUMP_COREDIR:/var/crash
crashkernel addr:
/var/lib/kdump/vmlinuz: symbolic link to /boot/vmlinux-4.15.0-12-generic
kdump initrd:
/var/lib/kdump/initrd.img: symbolic link to /var/lib/kdump/initrd.img-4.15.0-12-generic
current state:ready to kdump

kexec command:
/sbin/kexec -p --command-line="root=UUID=0266024d-8ea3-4132-ad62-b49befd6f8d9 ro quiet splash nr_cpus=1 systemd.unit=kdump-tools.service irqpoll noirqdistrib nousb" --initrd=/var/lib/kdump/initrd.img /var/lib/kdump/vmlinuz
root@ltc-wspoon4:~# echo "c" > /proc/sysrq-trigger
[  951.567597] sysrq: SysRq : This sysrq operation is disabled.
root@ltc-wspoon4:~# echo 1 > /proc/sys/kernel/sysrq
root@ltc-wspoon4:~# echo "c" > /proc/sysrq-trigger
[  968.396522] sysrq: SysRq : Trigger a crash
[  968.396558] Unable to handle kernel paging request for data at address 0x
[  968.396602] Faulting instruction address: 0xc07ec768
[  968.396640] Oops: Kernel access of bad area, sig: 11 [#1]
[  968.396670] LE SMP NR_CPUS=2048 NUMA PowerNV
[  968.396703] Modules linked in: idt_89hpesx(E) at24 uio_pdrv_genirq ofpart cmdlinepart powernv_flash mtd uio ibmpowernv ipmi_powernv vmx_crypto ipmi_devintf ipmi_msghandler opal_prd crct10dif_vpmsum sch_fq_codel ip_tables x_tables autofs4 ast i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ahci crc32c_vpmsum drm tg3 libahci
[  968.396893] CPU: 28 PID: 3086 Comm: bash Tainted: GE 4.15.0-12-generic #13~lp1758206
[  968.396944] NIP:  c07ec768 LR: c07ed6a8 CTR: c07ec740
[  968.396989] REGS: c54fb9f0 TRAP: 0300   Tainted: GE (4.15.0-12-generic)
[  968.397040] MSR:  90009033   CR: 2822  XER: 2004
[  968.397090] CFAR: c07ed6a4 DAR:  DSISR: 4200 SOFTE: 1
[  968.397090] GPR00: c07ed6a8 c54fbc70 c16eaf00 0063
[  968.397090] GPR04: c01ff76bce18 c01ff76d4368 90009033 000a
[  968.397090] GPR08: 0007 0001  90001003
[  968.397090] GPR12: c07ec740 c7a33400 0a463c88ae48
[  968.397090] GPR16: 0a462439e9f0 0a4624431998 0a46244319d0 0a4624468204
[  968.397090] GPR20:  0001  79ecd164
[  968.397090] GPR24: 79ecd160 0a462446afc4 c15e9968 0002
[  968.397090] GPR28: 0063 0007 c1572a9c c15e9d08
[  968.397486] NIP [c07ec768] sysrq_handle_crash+0x28/0x30
[  968.397524] LR [c07ed6a8] __handle_sysrq+0xf8/0x2c0
[  968.397554] Call Trace:
[  968.397571] [c54fbc70] [c07ed688] __handle_sysrq+0xd8/0x2c0 (unreliable)
[  968.397618] [c54fbd10] [c07edeb4] write_sysrq_trigger+0x64/0x90
[  968.397664] [c54fbd40] [c047dfe8] proc_reg_write+0x88/0xd0
[  968.397703] [c54fbd70] [c03d131c] __vfs_write+0x3c/0x70
[  968.397742] [c54fbd90] [c03d1578] vfs_write+0xd8/0x220
[  968.397781] [c54fbde0] [c03d1898] SyS_write+0x68/0x110
[  968.397821] [c54fbe30] [c000b184] system_call+0x58/0x6c
[  968.397857] Instruction dump:
[  968.397881] 4bfff9f1 4bfffe50 3c4c00f0 3842e7c0 7c0802a6 6000 3921 3d42001c
[  968.397929] 394a6db0 912a 7c0004ac 3940 <992a> 4e800020 3c4c00f0 3842e790
[  968.397979] ---[ end trace 42b5936ebd77f0df ]---
[  969.403420]
[  969.403499] Sending IPI to other CPUs
[  970.[ 9304.282854548,5] OPAL: Switch to big-endian OS
699527] IPI c[ 9308.106771743,5] OPAL: Switch to little-endian OS
[ 9309.438684420,3] PHB#[0:0]: CRESET: Unexpected slot state 0102, resetting...
[ 9310.039758053,3] PHB#[0:0]: Timeout waiting for DLP PG reset !
[ 9310.039836165,3] PHB#[0:0]: Initialization failed
[ 9312.102310864,3] PHB#0001[0:1]: Timeout waiting for DLP PG reset !
[ 9312.102386083,3] PHB#0001[0:1]: Initialization failed
[ 9314.164868252,3] PHB#0002[0:2]: Timeout waiting for DLP PG reset !
[ 9314.165418307,3] PHB#0002[0:2]: Initialization failed
[ 9316.116455526,3] PHB#0003[0:3]: CRESET: Unexpected slot state 0102, resetting...
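
The "kdump-config show" check earlier in this comment could be scripted
before triggering a crash. A minimal sketch, assuming the output keeps the
"current state:" line format shown above; kdump_ready is a hypothetical
helper that reads that output on stdin.

```shell
# Hypothetical helper: succeed if `kdump-config show` output on stdin
# reports that the crash kernel is loaded and ready.
kdump_ready() {
    # Allow any number of spaces after the colon.
    grep -Eq '^current state: *ready to kdump'
}
```

Usage: kdump-config show | kdump_ready && echo "kdump is armed"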

[Bug 1758206] Comment bridged from LTC Bugzilla

2018-03-25 Thread bugproxy
--- Comment From kalsh...@in.ibm.com 2018-03-25 21:18 EDT---
Can we get a patched kernel to test this?
