[Kernel-packages] [Bug 1762844] Second attempt of host reboot_boslcp3
--- Comment on attachment From indira.pr...@in.ibm.com 2018-04-12 11:13 EDT---

Attached boslcp3 host console logs during 2nd attempt of reboot

** Attachment added: "Second attempt of host reboot_boslcp3"
   https://bugs.launchpad.net/bugs/1762844/+attachment/5132661/+files/Secong_attempt_of_reboot_boslcp3.txt

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1762844

Title:
  ISST-LTE:KVM:Ubuntu1804:BostonLC:boslcp3: Host crashed & enters into
  xmon after moving to 4.15.0-15.16 kernel

Status in The Ubuntu-power-systems project:
  Incomplete
Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Bionic:
  Incomplete

Bug description:
  Problem Description:
  ===
  Host crashed & enters into xmon after updating to 4.15.0-15.16 kernel.

  Steps to re-create:
  ==
  1. boslcp3 is up with BMC:118 & PNOR: 20180330 levels
  2. Installed boslcp3 with latest kernel 4.15.0-13-generic
  3. Enabled "-proposed" kernel in /etc/apt/sources.list file
  4. Ran sudo apt-get update & apt-get upgrade
  5. root@boslcp3:~# ls /boot
     abi-4.15.0-13-generic         retpoline-4.15.0-13-generic
     abi-4.15.0-15-generic         retpoline-4.15.0-15-generic
     config-4.15.0-13-generic      System.map-4.15.0-13-generic
     config-4.15.0-15-generic      System.map-4.15.0-15-generic
     grub                          vmlinux
     initrd.img                    vmlinux-4.15.0-13-generic
     initrd.img-4.15.0-13-generic  vmlinux-4.15.0-15-generic
     initrd.img-4.15.0-15-generic  vmlinux.old
     initrd.img.old
  6. Rebooted & booted with 4.15.0-15 kernel
  7. Enabled xmon by editing file "vi /etc/default/grub" and ran update-grub
  8. Rebooted host.
  9. Booted with 4.15.0-15 & provided root/password credentials in login prompt
  10. Host crashed & enters into XMON state with 'Unable to handle kernel paging request'

  root@boslcp3:~# [ 66.295233] Unable to handle kernel paging request for data at address 0x8882f6ed90e9151a
  [ 66.295297] Faulting instruction address: 0xc038a110
  cpu 0x50: Vector: 380 (Data Access Out of Range) at [c692f650]
      pc: c038a110: kmem_cache_alloc_node+0x2f0/0x350
      lr: c038a0fc: kmem_cache_alloc_node+0x2dc/0x350
      sp: c692f8d0
     msr: 90009033
     dar: 8882f6ed90e9151a
    current = 0xc698fd00
    paca    = 0xcfab7000   softe: 0   irq_happened: 0x01
      pid   = 1762, comm = systemd-journal
  Linux version 4.15.0-15-generic (buildd@bos02-ppc64el-002) (gcc version 7.3.0 (Ubuntu 7.3.0-14ubuntu1)) #16-Ubuntu SMP Wed Apr 4 13:57:51 UTC 2018 (Ubuntu 4.15.0-15.16-generic 4.15.15)
  enter ? for help
  [c692f8d0] c0389fd4 kmem_cache_alloc_node+0x1b4/0x350 (unreliable)
  [c692f940] c0b2ec6c __alloc_skb+0x6c/0x220
  [c692f9a0] c0b30b6c alloc_skb_with_frags+0x7c/0x2e0
  [c692fa30] c0b247cc sock_alloc_send_pskb+0x29c/0x2c0
  [c692fae0] c0c5705c unix_dgram_sendmsg+0x15c/0x8f0
  [c692fbc0] c0b1ec64 sock_sendmsg+0x64/0x90
  [c692fbf0] c0b20abc ___sys_sendmsg+0x31c/0x390
  [c692fd90] c0b221ec __sys_sendmsg+0x5c/0xc0
  [c692fe30] c000b184 system_call+0x58/0x6c
  --- Exception: c00 (System Call) at 74826f6fa9c4
  SP (75dc5510) is in userspace
  50:mon>
  50:mon>

  10. Attached Host console logs

  I rebooted the host just to see if it would hit the issue again, and
  this time I didn't even get to the login prompt, but it crashed in the
  same location:

  50:mon> r
  R00 = c0389fd4           R16 = c000200e0b20fdc0
  R01 = c000200e0b20f8d0   R17 = 0048
  R02 = c16eb400           R18 = 0001fe80
  R03 = 0001               R19 =
  R04 = 0048ca1cff37803d   R20 =
  R05 = 0688               R21 =
  R06 = 0001               R22 = 0048
  R07 = 0687               R23 = 4882d6e3c8b7ab55
  R08 = 48ca1cff37802b68   R24 = c000200e5851df01
  R09 =                    R25 = 8882f6ed90e67454
  R10 =                    R26 = c0b2ec6c
  R11 = c0d10f78           R27 = c00ff901ee00
  R12 = 2000               R28 =
  R13 = cfab7000           R29 = 015004c0
  R14 = c000200e4c973fc8   R30 = c000200e5851df01
  R15 = c000200e4c974238   R31 = c00ff901ee00
  pc  = c038a110 kmem_cache_alloc_node+0x2f0/0x350
  cfar= c0016e1c arch_local_irq_restore+0x1c/0x90
  lr  = c038a0fc kmem_cache_alloc_node+0x2dc/0x350
  msr = 90009033   cr  = 28002844
  ctr = c061e1b0   xer =    trap = 380
  dar = 8882f6ed90e67454   dsisr = c000200e40bd8400
  50:mon> t
  [c000200e0b20f8d0] c0389fd4 kmem_cache_alloc_node+0x1b4/0x350 (unreliable)
  [c000200e0b20f940] c000
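
Step 7 of the report says only that xmon was enabled by editing /etc/default/grub. A minimal sketch of what such an edit might look like follows, assuming the reporter added the powerpc kernel parameter `xmon=on` to `GRUB_CMDLINE_LINUX_DEFAULT`; the report does not show the actual line that was changed. The function name `enable_xmon` is hypothetical, and the file path is taken as an argument so the change can be tried on a copy first.

```shell
#!/bin/sh
# Sketch only: the bug report does not show the exact edit that was made.
# Assumption: xmon was enabled with the "xmon=on" kernel parameter.
# enable_xmon is a hypothetical helper name, not part of any package.
enable_xmon() {
    grub_file="$1"

    # Leave the file alone if an xmon= setting is already present.
    if grep -q '^GRUB_CMDLINE_LINUX_DEFAULT=.*xmon=' "$grub_file"; then
        return 0
    fi

    # Append xmon=on just before the closing quote of the existing value.
    # sed -i is the GNU sed in-place option, as shipped on Ubuntu.
    sed -i 's/^\(GRUB_CMDLINE_LINUX_DEFAULT="[^"]*\)"/\1 xmon=on"/' "$grub_file"
}
```

On a real host this would be followed by something like `enable_xmon /etc/default/grub && update-grub` as root and then a reboot, matching steps 7 and 8 above.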
[Kernel-packages] [Bug 1762844] Second attempt of host reboot_boslcp3
--- Comment on attachment From indira.pr...@in.ibm.com 2018-04-12 11:13 EDT---

Attached boslcp3 host console logs during 2nd attempt of reboot

** Attachment added: "Second attempt of host reboot_boslcp3"
   https://bugs.launchpad.net/bugs/1762844/+attachment/5127933/+files/Secong_attempt_of_reboot_boslcp3.txt

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1762844

Title:
  ISST-LTE:KVM:Ubuntu1804:BostonLC:boslcp3: Host crashed & enters into
  xmon after moving to 4.15.0-15.16 kernel

Status in The Ubuntu-power-systems project:
  Incomplete
Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Bionic:
  Incomplete
[Kernel-packages] [Bug 1762844] Second attempt of host reboot_boslcp3
--- Comment on attachment From indira.pr...@in.ibm.com 2018-04-12 11:13 EDT---

Attached boslcp3 host console logs during 2nd attempt of reboot

** Attachment added: "Second attempt of host reboot_boslcp3"
   https://bugs.launchpad.net/bugs/1762844/+attachment/5125468/+files/Secong_attempt_of_reboot_boslcp3.txt

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1762844

Title:
  ISST-LTE:KVM:Ubuntu1804:BostonLC:boslcp3: Host crashed & enters into
  xmon after moving to 4.15.0-15.16 kernel

Status in The Ubuntu-power-systems project:
  Triaged
Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Bionic:
  Triaged
[Kernel-packages] [Bug 1762844] Second attempt of host reboot_boslcp3
--- Comment on attachment From indira.pr...@in.ibm.com 2018-04-12 11:13 EDT---

Attached boslcp3 host console logs during 2nd attempt of reboot

** Attachment added: "Second attempt of host reboot_boslcp3"
   https://bugs.launchpad.net/bugs/1762844/+attachment/5112353/+files/Secong_attempt_of_reboot_boslcp3.txt

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1762844

Title:
  ISST-LTE:KVM:Ubuntu1804:BostonLC:boslcp3: Host crashed & enters into
  xmon after moving to 4.15.0-15.16 kernel

Status in The Ubuntu-power-systems project:
  Triaged
Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Bionic:
  Triaged