Public bug reported:

Problem Description
==========================
On talclp1, I enabled kdump, but the kdump kernel failed to boot and dropped to a BusyBox shell.

root@talclp1:~# echo c> /proc/sysrq-trigger
[  132.643690] sysrq: SysRq : Trigger a crash
[  132.643739] Unable to handle kernel paging request for data at address 
0x00000000
[  132.643745] Faulting instruction address: 0xc0000000005c28f4
[  132.643749] Oops: Kernel access of bad area, sig: 11 [#1]
[  132.643753] SMP NR_CPUS=2048 NUMA pSeries
[  132.643758] Modules linked in: fuse ufs qnx4 hfsplus hfs minix ntfs msdos 
jfs rpadlpar_io rpaphp rpcsec_gss_krb5 nfsv4 dccp_diag cifs nfs dns_resolver 
dccp tcp_diag fscache udp_diag inet_diag unix_diag af_packet_diag netlink_diag 
binfmt_misc xfs libcrc32c pseries_rng rng_core ghash_generic gf128mul 
vmx_crypto sg nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables x_tables 
autofs4 ext4 crc16 jbd2 fscrypto mbcache crc32c_generic btrfs xor raid6_pq 
dm_round_robin sr_mod sd_mod cdrom ses enclosure scsi_transport_sas ibmveth 
crc32c_vpmsum ipr scsi_dh_emc scsi_dh_rdac scsi_dh_alua dm_multipath dm_mod
[  132.643819] CPU: 49 PID: 10174 Comm: bash Not tainted 4.8.0-15-generic 
#16-Ubuntu
[  132.643824] task: c000000111767080 task.stack: c0000000d82e0000
[  132.643828] NIP: c0000000005c28f4 LR: c0000000005c39d8 CTR: c0000000005c28c0
[  132.643832] REGS: c0000000d82e3990 TRAP: 0300   Not tainted  
(4.8.0-15-generic)
[  132.643836] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28242422  XER: 
00000001
[  132.643848] CFAR: c0000000000087d0 DAR: 0000000000000000 DSISR: 42000000 
SOFTE: 1
GPR00: c0000000005c39d8 c0000000d82e3c10 c000000000f67b00 0000000000000063
GPR04: c00000011d04a9b8 c00000011d05f7e0 c00000047fb00000 0000000000015998
GPR08: 0000000000000007 0000000000000001 0000000000000000 0000000000000001
GPR12: c0000000005c28c0 c000000007b4b900 ffffffffffffffff 0000000022000000
GPR16: 0000000010170dc8 000001002b566368 0000000010140f58 00000000100c7570
GPR20: 0000000000000000 000000001017dd58 0000000010153618 000000001017b608
GPR24: 00003ffffe87a294 0000000000000001 c000000000ebff60 0000000000000004
GPR28: c000000000ec0320 0000000000000063 c000000000e72a90 0000000000000000
[  132.643906] NIP [c0000000005c28f4] sysrq_handle_crash+0x34/0x50
[  132.643911] LR [c0000000005c39d8] __handle_sysrq+0xe8/0x280
[  132.643914] Call Trace:
[  132.643917] [c0000000d82e3c10] [c000000000a245e8] 0xc000000000a245e8 
(unreliable)
[  132.643923] [c0000000d82e3c30] [c0000000005c39d8] __handle_sysrq+0xe8/0x280
[  132.643928] [c0000000d82e3cd0] [c0000000005c4188] 
write_sysrq_trigger+0x78/0xa0
[  132.643935] [c0000000d82e3d00] [c0000000003ad770] proc_reg_write+0xb0/0x110
[  132.643941] [c0000000d82e3d50] [c00000000030fc3c] __vfs_write+0x6c/0xe0
[  132.643946] [c0000000d82e3d90] [c000000000311144] vfs_write+0xd4/0x240
[  132.643950] [c0000000d82e3de0] [c000000000312e5c] SyS_write+0x6c/0x110
[  132.643957] [c0000000d82e3e30] [c0000000000095e0] system_call+0x38/0x108
[  132.643961] Instruction dump:
[  132.643963] 38425240 7c0802a6 f8010010 f821ffe1 60000000 60000000 3d220019 
3949ba60
[  132.643972] 39200001 912a0000 7c0004ac 39400000 <992a0000> 38210020 e8010010 
7c0803a6
[  132.643981] ---[ end trace eed6bbcd2c3bdfdf ]---
[  132.646105]
[  132.646176] Sending IPI to other CPUs
[  132.647490] IPI complete
I'm in purgatory
 -> smp_release_cpus()
spinning_secondaries = 104
 <- smp_release_cpus()
[    2.011346] alg: hash: Test 1 failed for crc32c-vpmsum
[    2.729254] sd 0:2:0:0: [sda] Assuming drive cache: write through
[    2.731554] sd 1:2:5:0: [sdn] Assuming drive cache: write through
[    2.739087] sd 1:2:4:0: [sdm] Assuming drive cache: write through
[    2.739089] sd 1:2:6:0: [sdo] Assuming drive cache: write through
[    2.739110] sd 1:2:7:0: [sdp] Assuming drive cache: write through
[    2.739115] sd 1:2:0:0: [sdi] Assuming drive cache: write through
[    2.739122] sd 1:2:3:0: [sdl] Assuming drive cache: write through
[    2.739123] sd 1:2:2:0: [sdk] Assuming drive cache: write through
[    2.739148] sd 1:2:1:0: [sdj] Assuming drive cache: write through
[    2.748938] sd 0:2:1:0: [sdb] Assuming drive cache: write through
[    2.748939] sd 0:2:7:0: [sdh] Assuming drive cache: write through
[    2.748940] sd 0:2:6:0: [sdg] Assuming drive cache: write through
[    2.748942] sd 0:2:2:0: [sdc] Assuming drive cache: write through
[    2.748958] sd 0:2:5:0: [sdf] Assuming drive cache: write through
[    2.748963] sd 0:2:4:0: [sde] Assuming drive cache: write through
[    2.748978] sd 0:2:3:0: [sdd] Assuming drive cache: write through
[    2.999087] device-mapper: table: 254:0: multipath: error attaching hardware 
handler
[    3.119912] device-mapper: table: 254:0: multipath: error attaching hardware 
handler
[    3.252513] device-mapper: table: 254:0: multipath: error attaching hardware 
handler
[    3.343680] device-mapper: table: 254:0: multipath: error attaching hardware 
handler
[    3.381234] device-mapper: table: 254:1: multipath: error attaching hardware 
handler
[    3.419515] device-mapper: table: 254:0: multipath: error attaching hardware 
handler
[    3.474587] device-mapper: table: 254:1: multipath: error attaching hardware 
handler
[    3.482188] device-mapper: table: 254:0: multipath: error attaching hardware 
handler
[    3.531439] device-mapper: table: 254:1: multipath: error attaching hardware 
handler
[    3.552824] device-mapper: table: 254:0: multipath: error attaching hardware 
handler
[    3.594489] device-mapper: table: 254:1: multipath: error attaching hardware 
handler
[    3.619222] device-mapper: table: 254:0: multipath: error attaching hardware 
handler
[    3.672208] device-mapper: table: 254:0: multipath: error attaching hardware 
handler
[    3.680298] device-mapper: table: 254:1: multipath: error attaching hardware 
handler
[    3.731718] device-mapper: table: 254:0: multipath: error attaching hardware 
handler
[    3.761333] device-mapper: table: 254:1: multipath: error attaching hardware 
handler
[    3.794955] device-mapper: table: 254:0: multipath: error attaching hardware 
handler
[    3.819212] device-mapper: table: 254:1: multipath: error attaching hardware 
handler
[    3.871913] device-mapper: table: 254:0: multipath: error attaching hardware 
handler
[    3.889439] device-mapper: table: 254:1: multipath: error attaching hardware 
handler
[    3.922620] device-mapper: table: 254:0: multipath: error attaching hardware 
handler
[    3.960707] device-mapper: table: 254:1: multipath: error attaching hardware 
handler
[    4.002959] device-mapper: table: 254:0: multipath: error attaching hardware 
handler
[    4.035611] device-mapper: table: 254:1: multipath: error attaching hardware 
handler
[    4.054476] device-mapper: table: 254:0: multipath: error attaching hardware 
handler
[    4.092241] device-mapper: table: 254:1: multipath: error attaching hardware 
handler
[    4.099432] device-mapper: table: 254:0: multipath: error attaching hardware 
handler
[    4.182358] device-mapper: table: 254:0: multipath: error attaching hardware 
handler
[    4.182823] device-mapper: table: 254:1: multipath: error attaching hardware 
handler
[    4.234767] device-mapper: table: 254:1: multipath: error attaching hardware 
handler
[    4.333309] device-mapper: table: 254:0: multipath: error attaching hardware 
handler
[    4.402827] device-mapper: table: 254:0: multipath: error attaching hardware 
handler


Gave up waiting for root device.  Common problems:
 - Boot args (cat /proc/cmdline)
   - Check rootdelay= (did the system wait long enough?)
   - Check root= (did the system wait for the right device?)
 - Missing modules (cat /proc/modules; ls /dev)
ALERT!  UUID=853769e5-1dc5-41be-a689-b430320d207f does not exist.  Dropping to 
a shell!


BusyBox v1.22.1 (Ubuntu 1:1.22.0-19ubuntu2) built-in shell (ash)
Enter 'help' for a list of built-in commands.

(initramfs)


== Comment: #7 - Vaishnavi Bhat <[email protected]> - 2016-10-07 05:37:53 ==
The blkid output does not show any device with
UUID=853769e5-1dc5-41be-a689-b430320d207f, which is the root device passed on
the kexec command line (from kdump-config show).
kexec command:
  /sbin/kexec -p --command-line="BOOT_IMAGE=/boot/vmlinux-4.8.0-15-generic 
root=UUID=853769e5-1dc5-41be-a689-b430320d207f ro xmon=on splash quiet irqpoll 
nr_cpus=1 nousb systemd.unit=kdump-tools.service" 
--initrd=/var/lib/kdump/initrd.img /var/lib/kdump/vmlinuz

Hence the kdump kernel is failing to boot here.
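
The check described in this comment can be repeated with a small helper. This
is only a sketch: the function name root_uuid_from_cmdline is made up for
illustration, and it assumes the root device is given in the root=UUID=... form
shown in the kexec command above.

```shell
# Print the UUID from the first root=UUID=... argument of a kernel
# command line string passed as $1. Prints nothing if there is none.
# (Hypothetical helper, not part of kdump-tools.)
root_uuid_from_cmdline() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^root=UUID=//p' | head -n 1
}

# Example use on the affected system (as root; not executed here):
#   uuid=$(root_uuid_from_cmdline "$(cat /proc/cmdline)")
#   blkid -U "$uuid" || echo "no block device with UUID $uuid"
```

If blkid finds no device for that UUID, the initramfs has nothing to mount as
root, which matches the "Gave up waiting for root device" message above.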

== Comment: #11 - Xue Sheng Li <[email protected]> - 2016-10-17 01:54:56 ==
Recreated with the -24 kernel.

root@talclp1:~# echo c > /proc/sysrq-trigger
[   72.655416] sysrq: SysRq : Trigger a crash
[   72.655458] Unable to handle kernel paging request for data at address 
0x00000000
[   72.655463] Faulting instruction address: 0xc00000000069d148
[   72.655469] Oops: Kernel access of bad area, sig: 11 [#1]
[   72.655472] SMP NR_CPUS=2048 NUMA pSeries
[   72.655477] Modules linked in: rpadlpar_io rpaphp dccp_diag dccp tcp_diag 
udp_diag inet_diag unix_diag af_packet_diag netlink_diag rpcsec_gss_krb5 nfsv4 
nfs cifs fscache binfmt_misc xfs pseries_rng vmx_crypto nfsd auth_rpcgss 
nfs_acl lockd grace sunrpc ip_tables x_tables autofs4 btrfs xor raid6_pq 
dm_round_robin ses enclosure scsi_transport_sas bnx2x ipr mdio libcrc32c 
crc32c_vpmsum scsi_dh_emc scsi_dh_rdac scsi_dh_alua dm_multipath
[   72.655521] CPU: 25 PID: 9730 Comm: bash Not tainted 4.8.0-24-generic 
#26-Ubuntu
[   72.655525] task: c0000001d8451e00 task.stack: c0000001d8494000
[   72.655529] NIP: c00000000069d148 LR: c00000000069e198 CTR: c00000000069d120
[   72.655534] REGS: c0000001d84979f0 TRAP: 0300   Not tainted  
(4.8.0-24-generic)
[   72.655537] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28242222  XER: 
00000001
[   72.655549] CFAR: c000000000008750 DAR: 0000000000000000 DSISR: 42000000 
SOFTE: 1
GPR00: c00000000069e198 c0000001d8497c70 c000000001476700 0000000000000063
GPR04: c00000047e64aca0 c00000047e65fb40 c00000047df00000 0000000000015ed8
GPR08: 0000000000000007 0000000000000001 0000000000000000 0000000000000001
GPR12: c00000000069d120 c000000007b3e100 ffffffffffffffff 0000000022000000
GPR16: 0000000010170dc8 0000010036d36398 0000000010140f58 00000000100c7570
GPR20: 0000000000000000 000000001017dd58 0000000010153618 000000001017b608
GPR24: 00003ffff5582464 0000000000000001 c00000000138e6a0 0000000000000004
GPR28: c00000000138ea60 0000000000000063 c000000001342590 0000000000000000
[   72.655608] NIP [c00000000069d148] sysrq_handle_crash+0x28/0x30
[   72.655613] LR [c00000000069e198] __handle_sysrq+0xe8/0x280
[   72.655616] Call Trace:
[   72.655619] [c0000001d8497c70] [c00000000069e178] __handle_sysrq+0xc8/0x280 
(unreliable)
[   72.655625] [c0000001d8497d10] [c00000000069e8ec] 
write_sysrq_trigger+0x6c/0x90
[   72.655631] [c0000001d8497d40] [c0000000003a9568] proc_reg_write+0x88/0xd0
[   72.655637] [c0000001d8497d70] [c00000000030c40c] __vfs_write+0x3c/0x70
[   72.655642] [c0000001d8497d90] [c00000000030d674] vfs_write+0xd4/0x240
[   72.655647] [c0000001d8497de0] [c00000000030f1c8] SyS_write+0x68/0x110
[   72.655652] [c0000001d8497e30] [c000000000009584] system_call+0x38/0xec
[   72.655656] Instruction dump:
[   72.655658] 60000000 60000000 3c4c00de 384295e0 7c0802a6 60000000 3d22001a 
3949c8e0
[   72.655667] 39200001 912a0000 7c0004ac 39400000 <992a0000> 4e800020 3c4c00de 
384295b0
[   72.655677] ---[ end trace 43b490f085103bf5 ]---
[   72.659366]
[   72.659429] Sending IPI to other CPUs
[   72.660740] IPI complete
I'm in purgatory
 -> smp_release_cpus()
spinning_secondaries = 104
 <- smp_release_cpus()
[    1.699068] ibmveth 30000002 (unnamed net_device) (uninitialized): unable to 
change IPv4 checksum offload settings. 1 rc=4
[    1.699093] ibmveth 30000002 (unnamed net_device) (uninitialized): unable to 
change IPv6 checksum offload settings. 1 rc=4
[    1.699101] ibmveth 30000002 (unnamed net_device) (uninitialized): unable to 
change tso settings. 1 rc=4
[    2.657700] sd 0:2:1:0: [sdb] Assuming drive cache: write through
[    2.657701] sd 0:2:0:0: [sda] Assuming drive cache: write through
[    2.657781] sd 0:2:2:0: [sdc] Assuming drive cache: write through
[    2.660641] sd 0:2:7:0: [sdh] Assuming drive cache: write through
[    2.667731] sd 0:2:4:0: [sde] Assuming drive cache: write through
[    2.677685] sd 0:2:6:0: [sdg] Assuming drive cache: write through
[    2.677688] sd 0:2:5:0: [sdf] Assuming drive cache: write through
[    2.677708] sd 0:2:3:0: [sdd] Assuming drive cache: write through
[    2.697737] sd 1:2:6:0: [sdo] Assuming drive cache: write through
[    2.697743] sd 1:2:1:0: [sdj] Assuming drive cache: write through
[    2.697744] sd 1:2:4:0: [sdm] Assuming drive cache: write through
[    2.697747] sd 1:2:2:0: [sdk] Assuming drive cache: write through
[    2.697749] sd 1:2:3:0: [sdl] Assuming drive cache: write through
[    2.697753] sd 1:2:5:0: [sdn] Assuming drive cache: write through
[    2.699340] sd 1:2:7:0: [sdp] Assuming drive cache: write through
[    2.699360] sd 1:2:0:0: [sdi] Assuming drive cache: write through
[    3.350794] device-mapper: table: 252:0: multipath: error attaching hardware 
handler
[    3.471468] device-mapper: table: 252:0: multipath: error attaching hardware 
handler
[    3.540387] device-mapper: table: 252:0: multipath: error attaching hardware 
handler
[    3.628523] device-mapper: table: 252:0: multipath: error attaching hardware 
handler
[    3.657731] device-mapper: table: 252:1: multipath: error attaching hardware 
handler
[    3.733416] device-mapper: table: 252:0: multipath: error attaching hardware 
handler
[    3.752066] device-mapper: table: 252:1: multipath: error attaching hardware 
handler
[    3.808884] device-mapper: table: 252:0: multipath: error attaching hardware 
handler
[    3.838148] device-mapper: table: 252:1: multipath: error attaching hardware 
handler
[    3.919247] device-mapper: table: 252:0: multipath: error attaching hardware 
handler
[    3.950262] device-mapper: table: 252:1: multipath: error attaching hardware 
handler
[    3.997839] device-mapper: table: 252:0: multipath: error attaching hardware 
handler
[    4.007810] device-mapper: table: 252:1: multipath: error attaching hardware 
handler
[    4.082174] device-mapper: table: 252:0: multipath: error attaching hardware 
handler
[    4.089411] device-mapper: table: 252:1: multipath: error attaching hardware 
handler
[    4.162200] device-mapper: table: 252:0: multipath: error attaching hardware 
handler
[    4.202441] device-mapper: table: 252:1: multipath: error attaching hardware 
handler
[    4.252289] device-mapper: table: 252:0: multipath: error attaching hardware 
handler
[    4.279870] device-mapper: table: 252:1: multipath: error attaching hardware 
handler
[    4.311712] device-mapper: table: 252:0: multipath: error attaching hardware 
handler
[    4.348150] device-mapper: table: 252:1: multipath: error attaching hardware 
handler
[    4.402076] device-mapper: table: 252:0: multipath: error attaching hardware 
handler
[    4.432069] device-mapper: table: 252:1: multipath: error attaching hardware 
handler
[    4.487871] device-mapper: table: 252:0: multipath: error attaching hardware 
handler
[    4.518282] device-mapper: table: 252:1: multipath: error attaching hardware 
handler
[    4.573338] device-mapper: table: 252:0: multipath: error attaching hardware 
handler
[    4.599280] device-mapper: table: 252:1: multipath: error attaching hardware 
handler
[    4.632144] device-mapper: table: 252:0: multipath: error attaching hardware 
handler
[    4.671142] device-mapper: table: 252:1: multipath: error attaching hardware 
handler
[    4.713352] device-mapper: table: 252:0: multipath: error attaching hardware 
handler
[    4.782117] device-mapper: table: 252:0: multipath: error attaching hardware 
handler
[    4.890336] device-mapper: table: 252:0: multipath: error attaching hardware 
handler

== Comment: #13 - Hari Krishna Bathini <[email protected]> - 2016-10-19 
16:26:57 ==
(In reply to comment #12)
> Hi Hari,
> 
> Can you please take a look at this issue and suggest what would be the next
> step ?
> We are facing this issue with -24 kernel as well. Can this be a issue with
> kdump kernel that has missing multipath modules or some other issue ?
> 

Hi Vaishnavi,

The necessary multipath hardware handler modules are missing from the kdump
initrd. The console log of the kdump kernel confirms this:

--
Begin: Loading multipath hardware handlers ... Failure: failed to load module 
scsi_dh_alua.
Failure: failed to load module scsi_dh_rdac.
Failure: failed to load module scsi_dh_emc.
--
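
The names of the missing modules can be pulled out of a saved console log
mechanically. A minimal sketch, assuming each message sits on one line in the
form shown above; the function name failed_modules is made up for illustration.

```shell
# Read console log text on stdin and print the name of every module
# reported as "Failure: failed to load module NAME.", one per line.
# (Hypothetical helper; the message format is taken from the log above.)
failed_modules() {
    sed -n 's/.*Failure: failed to load module \([A-Za-z0-9_-]*\)\.*$/\1/p'
}

# Example use (console.log is a placeholder file name; not executed here):
#   failed_modules < console.log
```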

After explicitly including these modules and rebuilding the kdump initrd, the
kdump kernel gets to the point where makedumpfile starts to capture the dump,
but makedumpfile then fails with:

    "get_mem_map: Can't distinguish the memory type."

which is already tracked in bug 146571.

Thanks
Hari

PS1: To explicitly add modules to the kdump initrd:

      1. List the necessary modules in the
         /var/lib/kdump/initramfs-tools/modules file
      2. mkinitramfs -d /var/lib/kdump/initramfs-tools -o /var/lib/kdump/initrd.img-$kver
      3. systemctl restart kdump-tools.service
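
The three steps above could be scripted roughly as follows. This is a sketch
under stated assumptions, not a tested fix: the function names add_modules and
rebuild_kdump_initrd are made up, the module list is the one from the console
log above, and defaulting $kver to the running kernel is a guess. The paths and
commands are the ones given in the steps.

```shell
# Append each module name to the given modules file unless it is
# already listed there, so that re-running does not add duplicates.
add_modules() {
    file=$1
    shift
    for mod in "$@"; do
        grep -qxF -- "$mod" "$file" 2>/dev/null || printf '%s\n' "$mod" >> "$file"
    done
}

# Steps 1-3 from the list above. Defined here but not invoked; must be
# run as root. kver defaults to the running kernel (assumption).
rebuild_kdump_initrd() {
    kver=${1:-$(uname -r)}
    add_modules /var/lib/kdump/initramfs-tools/modules \
        scsi_dh_alua scsi_dh_rdac scsi_dh_emc &&
    mkinitramfs -d /var/lib/kdump/initramfs-tools \
        -o "/var/lib/kdump/initrd.img-$kver" &&
    systemctl restart kdump-tools.service
}
```

Checking for an existing entry before appending keeps the modules file clean
when the workaround is applied more than once.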


Mirroring this bug to Canonical for their input on whether to include the
missing hardware handler modules in the kdump initrd or to proceed with the
workaround.

** Affects: linux (Ubuntu)
     Importance: Undecided
     Assignee: Taco Screen team (taco-screen-team)
         Status: New


** Tags: architecture-ppc64le bugnameltc-146907 severity-high 
targetmilestone-inin---

** Tags added: architecture-ppc64le bugnameltc-146907 severity-high
targetmilestone-inin---

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1635597

Title:
  Ubuntu16.10:talclp1: Kdump failed with multipath disk

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1635597/+subscriptions

-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
