[Kernel-packages] [Bug 1767927] Comment bridged from LTC Bugzilla
--- Comment From cdead...@us.ibm.com 2018-06-20 17:50 EDT ---
Ubuntu issue. Removing IBM FW tag.

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1767927

Title:
  ISST-LTE:pKVM:Ubuntu1804: rcu_sched self-detected stall on CPU follow by CPU ATTEMPT TO RE-ENTER FIRMWARE!

Status in The Ubuntu-power-systems project: Fix Released
Status in linux package in Ubuntu: Fix Released
Status in linux source package in Bionic: Fix Released

Bug description:
  == Comment: #0 - Application Cdeadmin - 2018-03-20 14:10:53 ==
  == Comment: #1 - Application Cdeadmin - 2018-03-20 14:10:54 ==
  == Comment: #2 - Application Cdeadmin - 2018-03-20 14:10:56 ==
  --- Comment From dougmill-ibm 2018-03-20 13:51:47 EDT ---
  This problem is not tied to a Linux distro. It will be fixed in firmware, as I understand it. Let us close any redundant issues for this same problem. Mark them as duplicate.

  == Comment: #3 - Application Cdeadmin - 2018-03-20 15:50:54 ==
  --- Comment From mzipse 2018-03-20 15:44:26 EDT ---
  @stewart-ibm @svaidy, I need you to take a first look. The stop fixes that Vaidy had previously highlighted in a recent note are included in the 3/15 PNOR.

  == Comment: #5 - Application Cdeadmin - 2018-04-04 16:10:56 ==
  --- Comment From haochanh 2018-04-04 16:04:07 EDT ---
  We updated to 0330, bmc=1.18, then we hit bug 1134. Currently we are running with stop5 disabled but still see the watchdog hard lockup.
  After 2 hours of test run, I am seeing the "Watchdog: Lockup" and "became unstuck" messages:

  [Wed Apr 4 13:38:25 2018] Watchdog CPU:42 Hard LOCKUP
  [Wed Apr 4 13:38:25 2018] Modules linked in: vhost_net vhost macvtap macvlan tap xfs xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter rpcsec_gss_krb5 nfsv4 nfs fscache rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_uverbs(OE) ib_umad(OE) esp6_offload esp6 esp4_offload esp4 xfrm_algo mlx5_fpga_tools(OE) mlx5_ib(OE) mlx5_core(OE) mlxfw(OE) cxl pnv_php mlx4_en(OE) mlx4_ib(OE) ib_core(OE) mlx4_core(OE) devlink mlx_compat(OE) kvm_hv kvm binfmt_misc dm_service_time dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua input_leds joydev mac_hid idt_89hpesx ipmi_powernv
  [Wed Apr 4 13:38:25 2018] vmx_crypto ipmi_devintf at24 ofpart uio_pdrv_genirq cmdlinepart uio powernv_flash ipmi_msghandler mtd crct10dif_vpmsum opal_prd ibmpowernv nfsd sch_fq_codel auth_rpcgss nfs_acl lockd grace sunrpc knem(OE) ip_tables x_tables autofs4 btrfs xor zstd_compress raid6_pq ses enclosure scsi_transport_sas hid_generic usbhid hid lpfc ast i2c_algo_bit ttm drm_kms_helper nvmet_fc syscopyarea sysfillrect nvmet sysimgblt fb_sys_fops nvme_fc nvme_fabrics crc32c_vpmsum drm i40e scsi_transport_fc aacraid [last unloaded: mlxfw]
  [Wed Apr 4 13:38:25 2018] CPU: 42 PID: 0 Comm: swapper/42 Tainted: G OE 4.15.0-12-generic #13
  [Wed Apr 4 13:38:25 2018] NIP: c00a3ca4 LR: c00a3ca4 CTR: c0008000
  [Wed Apr 4 13:38:25 2018] REGS: c00ff596fc40 TRAP: 0100 Tainted: G OE (4.15.0-12-generic)
  [Wed Apr 4 13:38:25 2018] MSR: 90001033 CR: 24004482 XER: 2004
  [Wed Apr 4 13:38:25 2018] CFAR: c00ff596fda0 SOFTE: 42
  GPR00: c00a3ca4 c00ff596fda0 c16eb200 c00ff596fc40
  GPR04: b0001033 c00a3690 24004484 000ffa45
  GPR08: 0001 c0d10ed8 00ff
  GPR12: 90121033 c7a3ce00 c00ff596ff90
  GPR16: c0047840 c0047810 c11b5380
  GPR20: 0800 c1722484 002a
  GPR24: 00a8 0007 0007
  GPR28: c161d270 c00ffb666fd8 c161d528 0007
  [Wed Apr 4 13:38:25 2018] NIP [c00a3ca4] power9_idle_type+0x24/0x40
  [Wed Apr 4 13:38:25 2018] LR [c00a3ca4] power9_idle_type+0x24/0x40
  [Wed Apr 4 13:38:25 2018] Call Trace:
  [Wed Apr 4 13:38:25 2018] [c00ff596fda0] [c00a3ca4] power9_idle_type+0x24/0x40 (unreliable)
  [Wed Apr 4 13:38:25 2018] [c00ff596fdc0] [c0ad1240] stop_loop+0x40/0x5c
  [Wed Apr 4 13:38:25 2018] [c00ff596fdf0] [c0acd9a4] cpuidle_enter_state+0xa4/0x450
  [Wed Apr 4 13:38:25 2018] [c00ff596fe50]
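For anyone triaging a long run of these reports, the watchdog output above can be summarized mechanically. The following is only a rough sketch, not a tool used in this bug: the function name `summarize_lockups` and both regular expressions are assumptions based on the line formats visible in the quoted log ("Watchdog CPU:<n> Hard LOCKUP" and "symbol+0xoff/0xsize"), and the sample text is copied from that log.

```python
import re
from collections import Counter

# Assumed format, taken from the quoted log: "Watchdog CPU:42 Hard LOCKUP".
LOCKUP_RE = re.compile(r"Watchdog CPU:(\d+) Hard LOCKUP")
# Assumed format for trace entries: "stop_loop+0x40/0x5c".
FRAME_RE = re.compile(r"\b([A-Za-z_][A-Za-z0-9_.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_lockups(log_text):
    """Return (lockups per CPU, unique symbols in first-seen order)."""
    cpus = Counter(int(m.group(1)) for m in LOCKUP_RE.finditer(log_text))
    frames = []
    for m in FRAME_RE.finditer(log_text):
        symbol = m.group(1)
        if symbol not in frames:  # keep first-seen order, drop repeats
            frames.append(symbol)
    return cpus, frames


# Lines copied from the report quoted above.
sample = """\
[Wed Apr 4 13:38:25 2018] Watchdog CPU:42 Hard LOCKUP
[Wed Apr 4 13:38:25 2018] NIP [c00a3ca4] power9_idle_type+0x24/0x40
[Wed Apr 4 13:38:25 2018] LR [c00a3ca4] power9_idle_type+0x24/0x40
[Wed Apr 4 13:38:25 2018] [c00ff596fdc0] [c0ad1240] stop_loop+0x40/0x5c
[Wed Apr 4 13:38:25 2018] [c00ff596fdf0] [c0acd9a4] cpuidle_enter_state+0xa4/0x450
"""

if __name__ == "__main__":
    cpus, frames = summarize_lockups(sample)
    print("lockups per CPU:", dict(cpus))
    print("symbols seen:", frames)
```

Feeding it a full dmesg capture instead of `sample` would show whether the lockups cluster on particular CPUs and whether they all sit in the idle path, which is the question the stop5 experiment was trying to answer.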
[Kernel-packages] [Bug 1767927] Comment bridged from LTC Bugzilla
--- Comment From gwal...@br.ibm.com 2018-06-07 16:54 EDT ---
Hi, I have been looking for a machine to reproduce this and then update everything to validate it. I hope a Boston LC will be available again for testing tomorrow. Sorry for the delay.

Status in The Ubuntu-power-systems project: Fix Committed
Status in linux package in Ubuntu: Fix Committed
Status in linux source package in Bionic: Fix Committed
[Kernel-packages] [Bug 1767927] Comment bridged from LTC Bugzilla
--- Comment From cdead...@us.ibm.com 2018-05-14 16:51 EDT ---
Ubuntu issue, so removing LC GA1 Mustfix.

Status in The Ubuntu-power-systems project: Triaged
Status in linux package in Ubuntu: Triaged
Status in linux source package in Bionic: Triaged
[Kernel-packages] [Bug 1767927] Comment bridged from LTC Bugzilla
--- Comment From cdead...@us.ibm.com 2018-05-09 17:11 EDT ---
@haochanh please verify and close.

Status in The Ubuntu-power-systems project: Triaged
Status in linux package in Ubuntu: Triaged
Status in linux source package in Bionic: Triaged
[Kernel-packages] [Bug 1767927] Comment bridged from LTC Bugzilla
--- Comment From cdead...@us.ibm.com 2018-05-07 12:51 EDT ---
I'm a bit confused regarding this issue 1105 and issue 1257. @haochanh Chan, can you also post the plc log here?

Status in The Ubuntu-power-systems project: Triaged
Status in linux package in Ubuntu: Triaged
Status in linux source package in Bionic: Triaged
[Kernel-packages] [Bug 1767927] Comment bridged from LTC Bugzilla
--- Comment From gwal...@br.ibm.com 2018-05-07 12:43 EDT ---
Hi Canonical,
What is wrong with my request to add those patches?

Status in The Ubuntu-power-systems project: Triaged
Status in linux package in Ubuntu: Triaged
Status in linux source package in Bionic: Triaged
[Kernel-packages] [Bug 1767927] Comment bridged from LTC Bugzilla
--- Comment From gwal...@br.ibm.com 2018-04-30 13:07 EDT ---
Submitted
https://lists.ubuntu.com/archives/kernel-team/2018-April/092069.html
https://lists.ubuntu.com/archives/kernel-team/2018-April/092070.html
https://lists.ubuntu.com/archives/kernel-team/2018-April/092071.html

Status in The Ubuntu-power-systems project: Triaged
Status in linux package in Ubuntu: New