------- Comment From pavra...@in.ibm.com 2018-04-05 07:34 EDT-------
The issue is resolved in the 4.15.0-15-generic kernel.
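
A quick way to tell whether a given machine is already on a kernel at or past that build is to compare the ABI number in its release string. The following is only a sketch: `has_fix_abi` is a made-up helper name, and it assumes that any 4.15.0-N kernel with N >= 15 carries the fix, which this report confirms only for N = 15.

```shell
#!/bin/sh
# has_fix_abi RELEASE
# Returns 0 if RELEASE is a 4.15.0-N-* kernel with N >= 15,
# 1 if N < 15, and 2 if RELEASE is not in the 4.15.0 series
# (in which case this check says nothing about the fix).
has_fix_abi() {
    case "$1" in
        4.15.0-*) ;;
        *) return 2 ;;
    esac
    abi="${1#4.15.0-}"   # e.g. "15-generic"
    abi="${abi%%-*}"     # e.g. "15"
    case "$abi" in
        ''|*[!0-9]*) return 2 ;;   # no numeric ABI field found
    esac
    [ "$abi" -ge 15 ]
}

# Example use against the running kernel:
if has_fix_abi "$(uname -r)"; then
    echo "running kernel is 4.15.0-15 or later"
else
    echo "running kernel is older than 4.15.0-15, or not a 4.15.0 kernel"
fi
```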

root@ltc-wspoon4:~# ppc64_cpu --smt
SMT is off

Starting Kernel crash dump capture service...
[   11.747657] kdump-tools[952]: Starting kdump-tools:  * running makedumpfile -c -d 31 /proc/vmcore /var/crash/201804050626/dump-incomplete
Copying data                                      : [100.0 %] \           eta: 0s
[   27.390223] kdump-tools[952]: The kernel version is not supported.
[   27.390438] kdump-tools[952]: The makedumpfile operation may be incomplete.
[   27.390563] kdump-tools[952]: The dumpfile is saved to /var/crash/201804050626/dump-incomplete.
[   27.390726] kdump-tools[952]: makedumpfile Completed.
[   27.405543] kdump-tools[952]:  * kdump-tools: saved vmcore in /var/crash/201804050626
[   30.762418] kdump-tools[952]:  * running makedumpfile --dump-dmesg /proc/vmcore /var/crash/201804050626/dmesg.201804050626
[   30.802776] kdump-tools[952]: The kernel version is not supported.
[   30.802923] kdump-tools[952]: The makedumpfile operation may be incomplete.
[   30.803025] kdump-tools[952]: The dmesg log is saved to /var/crash/201804050626/dmesg.201804050626.
[   30.803145] kdump-tools[952]: makedumpfile Completed.
[   30.803263] kdump-tools[952]:  * kdump-tools: saved dmesg content in /var/crash/201804050626
[   30.888353] kdump-tools[952]: Thu, 05 Apr 2018 06:26:24 -0500
[   31.035631] kdump-tools[952]: Rebooting.
[   31.126613] reboot: Restarting system
[ 1577.265030518,5] OPAL: Reboot request...

root@ltc-wspoon4:~# ppc64_cpu --smt
SMT=2

Starting Kernel crash dump capture service...
[   13.378626] kdump-tools[952]: Starting kdump-tools:  * running makedumpfile -c -d 31 /proc/vmcore /var/crash/201804050631/dump-incomplete
Copying data                                      : [100.0 %] |           eta: 0s
[   27.102530] kdump-tools[952]: The kernel version is not supported.
[   27.102659] kdump-tools[952]: The makedumpfile operation may be incomplete.
[   27.102787] kdump-tools[952]: The dumpfile is saved to /var/crash/201804050631/dump-incomplete.
[   27.102910] kdump-tools[952]: makedumpfile Completed.
[   27.112064] kdump-tools[952]:  * kdump-tools: saved vmcore in /var/crash/201804050631
[   29.632162] kdump-tools[952]:  * running makedumpfile --dump-dmesg /proc/vmcore /var/crash/201804050631/dmesg.201804050631
[   29.672730] kdump-tools[952]: The kernel version is not supported.
[   29.672890] kdump-tools[952]: The makedumpfile operation may be incomplete.
[   29.672997] kdump-tools[952]: The dmesg log is saved to /var/crash/201804050631/dmesg.201804050631.
[   29.673111] kdump-tools[952]: makedumpfile Completed.
[   29.673249] kdump-tools[952]:  * kdump-tools: saved dmesg content in /var/crash/201804050631
[   29.774672] kdump-tools[952]: Thu, 05 Apr 2018 06:31:40 -0500
[   29.913780] kdump-tools[952]: Rebooting.
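
In both runs above, the line that shows the dump was captured is the kdump-tools "saved vmcore in" message. A rough sketch of pulling that directory out of a saved console log follows; `kdump_saved_dirs` is a hypothetical helper name, and it assumes the log either keeps the path on the same line or wraps it onto the next one, as mail clients tend to do.

```shell
#!/bin/sh
# kdump_saved_dirs: read a console log on stdin and print every
# directory that kdump-tools reports saving a vmcore in.
kdump_saved_dirs() {
    awk '
        /kdump-tools: saved vmcore in/ {
            dir = $NF
            # If the line ends at "in", the path was wrapped onto
            # the next line; read that line and take its first field.
            if (dir == "in") {
                if ((getline) <= 0) next
                dir = $1
            }
            print dir
        }
    '
}
```

For example, `kdump_saved_dirs < console.log` (with `console.log` standing in for wherever the console output was saved) prints one directory per captured dump, and prints nothing if kdump never got that far.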

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1758206

Title:
  Ubuntu 18.04 [ WSP DD2.2 with stop4 and stop5 enabled ]: kdump fails
  to capture dump when smt=2 or off.

Status in The Ubuntu-power-systems project:
  Fix Committed
Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Bionic:
  Fix Committed

Bug description:
  ---Problem Description---

  Ubuntu 18.04 [ WSP DD2.2 with stop4 and stop5 enabled ]: kdump fails
  to capture dump when smt=2 or off.

  ---Environment--
  Kernel Build:  4.15.0-13-generic
  System Name :  ltc-wspoon4
  Model/Type  :  P9
  Platform    :  BML

  ---Steps to reproduce--

  1. Configure kdump.
  2. Set SMT to off:
  # ppc64_cpu --smt=off
  3. Trigger a crash:
  # echo 1 > /proc/sys/kernel/sysrq
  # echo "c" > /proc/sysrq-trigger
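
The steps above can be collected into one script. This is only a sketch: `DRY_RUN`, `run` and `reproduce` are names made up for illustration, and the script defaults to printing the commands rather than running them, because the last step deliberately panics the machine. It assumes kdump is already configured (step 1) and that it is run as root when `DRY_RUN=0`.

```shell
#!/bin/sh
# Reproduction sketch. With DRY_RUN=1 (the default) the commands are
# only printed; set DRY_RUN=0 to execute them for real.
DRY_RUN="${DRY_RUN:-1}"

# run CMD: print CMD in dry-run mode, otherwise execute it via sh -c
# so that shell redirections inside CMD take effect.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf 'would run: %s\n' "$*"
    else
        sh -c "$*"
    fi
}

# reproduce MODE: MODE is the SMT setting to test ("off" or "2").
reproduce() {
    smt_mode="${1:-off}"
    run "ppc64_cpu --smt=$smt_mode"          # step 2: set the SMT mode
    run "echo 1 > /proc/sys/kernel/sysrq"    # step 3: enable sysrq
    run "echo c > /proc/sysrq-trigger"       # step 3: trigger the crash
}

reproduce "${1:-off}"
```

Running it with no arguments prints the three commands for the smt=off case; passing `2` as the first argument covers the smt=2 case from the title.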

  ---Logs----

  root@ltc-wspoon4:~# dpkg -l|grep kexec
  ii  kexec-tools                         1:2.0.16-1ubuntu1                 ppc64el      tools to support fast kexec reboots
  root@ltc-wspoon4:~# makedumpfile -v
  makedumpfile: version 1.6.3 (released on 29 Jan 2018)
  lzo   enabled
  snappy        disabled

  
  [  285.519832] [c000001fe2d83de0] [c0000000003d1898] SyS_write+0x68/0x110
  [  285.519926] [c000001fe2d83e30] [c00000000000b184] system_call+0x58/0x6c
  [  285.520007] Instruction dump:
  [  285.520053] 4bfff9f1 4bfffe50 3c4c00f0 3842e800 7c0802a6 60000000 39200001 3d42001c 
  [  285.520158] 394a6db0 912a0000 7c0004ac 39400000 <992a0000> 4e800020 3c4c00f0 3842e7d0 
  [  285.520261] ---[ end trace 90a666dc7ca6f0ec ]---
  [  286.525787] 
  [  286.525883] Sending IPI to other CPUs
  [  28[  401.296284048,5] OPAL: Switch to big-endian OS
  [  402.297026662,3] OPAL: CPU 0x1 not in OPAL !
  6.851284] IPI complete
  [  403.455520784,3] OPAL: CPU 0x1 not in OPAL !nce.
  [  403.455569636,5] OPAL: Switch to little-endian OS
  [  404.455711332,3] OPAL: CPU 0x1 not in OPAL !
  [  404.470276386,3] PHB#0000[0:0]: CRESET: Unexpected slot state 00000102, resetting...
  [  413.140065625,3] PHB#0003[0:3]: CRESET: Unexpected slot state 00000102, resetting...
  [  421.393193605,3] PHB#0030[8:0]: CRESET: Unexpected slot state 00000102, resetting...
  [  423.353977316,3] PHB#0033[8:3]: CRESET: Unexpected slot state 00000102, resetting...
  [  425.314547966,3] PHB#0034[8:4]: CRESET: Unexpected slot state 00000102, resetting...

  [    5.004718] Processor 1 is stuck.
  [   10.007584] Processor 2 is stuck.
  [   15.010425] Processor 3 is stuck.
  [   16.135550] integrity: Unable to open file: /etc/keys/x509_ima.der (-2)
  [   16.135554] integrity: Unable to open file: /etc/keys/x509_evm.der (-2)
  [   16.250952] vio vio: uevent: failed to send synthetic uevent

  
  --== Welcome to Hostboot hostboot-5fc3b52/hbicore.bin ==--

    4.52180|secure|SecureROM valid - enabling functionality
    4.53193|secure|Booting in non-secure mode.
    6.00924|Booting from SBE side 0 on master proc=00050000

  
  There may also be a firmware issue here, but the kernel patches below
  still need to be included so that the kdump kernel captures the dump
  successfully when SMT is set to 2 or off:

  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=04b9c96eae72d862726f2f4bfcec2078240c33c5
  ("powerpc/crash: Remove the test for cpu_online in the IPI callback")

  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4145f358644b970fcff293c09fdcc7939e8527d2
  ("powernv/kdump: Fix cases where the kdump kernel can get HMI's")

  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=910961754572a2f4c83ad7e610d180
  ("powerpc/kdump: Fix powernv build break when KEXEC_CORE=n")

  Thanks
  Hari

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1758206/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
