[Kernel-packages] [Bug 1842465] Re: Watchdog error about hard lockup
The patch has meanwhile landed in disco's release pocket, hence adjusting to Fix Released.

** Changed in: linux (Ubuntu)
       Status: Fix Committed => Fix Released

** Changed in: ubuntu-power-systems
       Status: Fix Committed => Fix Released

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1842465

Title:
  Watchdog error about hard lockup

Status in The Ubuntu-power-systems project:
  Fix Released
Status in linux package in Ubuntu:
  Fix Released

Bug description:
  ---Problem Description---
  Got a message from Watchdog about self-detected hard LOCKUP

  ---uname output---
  Linux power 5.0.0-23-generic #24~18.04.1-Ubuntu SMP Mon Jul 29 16:08:34 UTC 2019 ppc64le ppc64le ppc64le GNU/Linux

  ---Additional Hardware Info---
  Architecture:        ppc64le
  Byte Order:          Little Endian
  CPU(s):              128
  On-line CPU(s) list: 0-127
  Thread(s) per core:  4
  Core(s) per socket:  16
  Socket(s):           2
  NUMA node(s):        6
  Model:               2.2 (pvr 004e 1202)
  Model name:          POWER9, altivec supported
  CPU max MHz:         3800.
  CPU min MHz:         2300.
  L1d cache:           32K
  L1i cache:           32K
  L2 cache:            512K
  L3 cache:            10240K
  NUMA node0 CPU(s):   0-63
  NUMA node8 CPU(s):   64-127
  NUMA node252 CPU(s):
  NUMA node253 CPU(s):
  NUMA node254 CPU(s):
  NUMA node255 CPU(s):

  --- free
               total        used        free      shared  buff/cache   available
  Mem:    1071807104     5110016   985192768     6229440    81504320  1056273664
  Swap:      2097088           0     2097088

  -- lsblk
  NAME    MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
  sda       8:0    1 894.3G  0 disk
  ├─sda1    8:1    1     7M  0 part
  └─sda2    8:2    1 894.3G  0 part /
  sdb       8:16   1 894.3G  0 disk
  nvme0n1 259:1    0   2.9T  0 disk /nvmdisk1

  ---
  Machine Type = AC922, bare metal

  ---Steps to Reproduce---
  I encountered this problem while running a customer workload: I switched the SMT level from SMT2 to SMT1 and got a lockup error right away. This seems to be a different one...

  The postgresql DB daemon was running on the system.

  Stack trace output:
  [756383.688067] watchdog: CPU 53 self-detected hard LOCKUP @ _raw_spin_lock+0x54/0xe0
  [756383.688068] watchdog: CPU 53 TB:387344180861438, last heartbeat TB:387337108856720 (13812ms ago)
  [756383.688069] Modules linked in: binfmt_misc veth ipt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat_ipv4 xt_addrtype iptable_filter bpfilter xt_conntrack nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter bridge stp llc aufs overlay vmx_crypto ofpart cmdlinepart powernv_flash ipmi_powernv opal_prd mtd ipmi_devintf at24 ibmpowernv ipmi_msghandler uio_pdrv_genirq uio sch_fq_codel ib_iser rdma_cm iw_cm ib_cm iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear mlx5_ib ib_uverbs ib_core ast crct10dif_vpmsum i2c_algo_bit crc32c_vpmsum ttm mlx5_core drm_kms_helper syscopyarea nvme sysfillrect sysimgblt fb_sys_fops drm nvme_core ahci libahci tls mlxfw devlink tg3 drm_panel_orientation_quirks
  [756383.688088] CPU: 53 PID: 119744 Comm: postgres Not tainted 5.0.0-23-generic #24~18.04.1-Ubuntu
  [756383.688088] NIP: c0e0fcc4 LR: c015fd90 CTR: c0600460
  [756383.688089] REGS: c07fffb3bd70 TRAP: 0900 Not tainted (5.0.0-23-generic)
  [756383.688089] MSR: 90009033 CR: 28242824 XER:
  [756383.688091] CFAR: c0e0fcec IRQMASK: 1
  [756383.688092] GPR00: c015fd90 c000206f2cdf7970 c185c700 c00020732ea49100
  [756383.688093] GPR04: c000206f2cdf7a38 c000206f2cdf7b00 0001
  [756383.688095] GPR08: 0003 807d 8035 fffd
  [756383.688096] GPR12: 2000 c07c5080 7cde07504dd8 0f495eee0d68
  [756383.688097] GPR16: 7fffc0eb2bd7 7fffc0eb2aa0 0f496c289088 7fffc0eb2a74
  [756383.688098] GPR20: 0001 0001
  [756383.688099] GPR24: c000206f2cdf7a38 c1349100 20732d70
  [756383.688100] GPR28: c1891c70 c000206f36d8b400 c1895c78 c00020732ea49100
  [756383.688102] NIP [c0e0fcc4] _raw_spin_lock+0x54/0xe0
  [756383.688102] LR [c015fd90] __task_rq_lock+0x80/0x150
  [756383.688102] Call Trace:
  [756383.688103] [c000206f2cdf7970] [c000206f2cdf79d0] 0xc000206f2cdf79d0 (unreliable)
  [756383.688103] [c000206f2cdf79a0] [c07fd3847818] 0xc07fd3847818
  [756383.688104] [c000206f2cdf7a10] [c01649c0] try_to_wake_up+0x380/0x710
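The "(13812ms ago)" figure on the second watchdog line can be cross-checked against the two timebase (TB) values printed next to it. The following is a minimal sketch of that arithmetic, not the kernel's own code. It assumes a 512 MHz timebase, which is not stated in the report, and the helper name `heartbeat_gap_ms` is made up for illustration.

```python
import re

# Assumption (not stated in the bug report): the timebase ticks at 512 MHz,
# i.e. 512,000 ticks per millisecond.
TIMEBASE_HZ = 512_000_000

# Matches the "TB:<now>, last heartbeat TB:<last>" part of the watchdog line.
WATCHDOG_RE = re.compile(
    r"watchdog: CPU (?P<cpu>\d+) TB:(?P<now>\d+), last heartbeat TB:(?P<last>\d+)"
)


def heartbeat_gap_ms(line: str, timebase_hz: int = TIMEBASE_HZ) -> int:
    """Return the whole milliseconds between the last heartbeat and the report."""
    match = WATCHDOG_RE.search(line)
    if match is None:
        raise ValueError("not a watchdog heartbeat line")
    ticks = int(match["now"]) - int(match["last"])
    # Integer arithmetic avoids float rounding on the large tick counts.
    return ticks * 1000 // timebase_hz


line = (
    "[756383.688068] watchdog: CPU 53 TB:387344180861438, "
    "last heartbeat TB:387337108856720 (13812ms ago)"
)
print(heartbeat_gap_ms(line))  # 13812
```

Under that assumption the difference of 7,072,004,718 ticks comes out at 13812 ms, which agrees with the value the watchdog printed.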
[Kernel-packages] [Bug 1842465] Re: Watchdog error about hard lockup
Marked as "Fix Committed" as the patchset was picked up automatically by the latest 5.0 stable sync.

Status in The Ubuntu-power-systems project:
  Fix Committed
Status in linux package in Ubuntu:
  Fix Committed

Call Trace (as far as this notification quotes it):
  [756383.688103] [c000206f2cdf7970] [c000206f2cdf79d0] 0xc000206f2cdf79d0 (unreliable)
  [756383.688103] [c000206f2cdf79a0] [c07fd3847818] 0xc07fd3847818
  [756383.688104] [c000206f2cdf7a10] [c01649c0] try_to_wake_up+0x380/0x710
  [756383.688105] [c000206f2cdf7aa0] [c0164de0] wake_up_q+0x70/0xd0
  [756383.688105] [c000206f2cdf7ae0] [c05fab54]
[Kernel-packages] [Bug 1842465] Re: Watchdog error about hard lockup
** Changed in: linux (Ubuntu)
   Importance: Undecided => High

** Changed in: linux (Ubuntu)
     Assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) => Canonical Kernel Team (canonical-kernel-team)

Status in The Ubuntu-power-systems project:
  Fix Committed
Status in linux package in Ubuntu:
  Fix Committed
[Kernel-packages] [Bug 1842465] Re: Watchdog error about hard lockup
** Changed in: ubuntu-power-systems
       Status: Confirmed => Fix Committed

** Changed in: linux (Ubuntu)
       Status: Confirmed => Fix Committed

Status in The Ubuntu-power-systems project:
  Fix Committed
Status in linux package in Ubuntu:
  Fix Committed
[Kernel-packages] [Bug 1842465] Re: Watchdog error about hard lockup
** Changed in: ubuntu-power-systems
     Assignee: Canonical Kernel Team (canonical-kernel-team) => Frank Heimes (frank-heimes)

Status in The Ubuntu-power-systems project:
  Confirmed
Status in linux package in Ubuntu:
  Confirmed
[Kernel-packages] [Bug 1842465] Re: Watchdog error about hard lockup
The commit mentioned above is already in disco master-next:

~/ubuntu-disco-master-next/ubuntu-disco-clean$ git log --oneline | grep -m 1 "powerpc/watchdog: Use hrtimers for per-CPU heartbeat"
fad8027 powerpc/watchdog: Use hrtimers for per-CPU heartbeat

but not yet tagged.

** Changed in: linux (Ubuntu)
       Status: New => Confirmed

** Changed in: ubuntu-power-systems
       Status: New => Confirmed

Status in The Ubuntu-power-systems project:
  Confirmed
Status in linux package in Ubuntu:
  Confirmed
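The `git log --oneline | grep -m 1` check quoted in this message can also be scripted. The sketch below reproduces only the "first matching subject" step, working on text already captured from `git log --oneline`. The function name `first_matching_commit` and the extra log entries are hypothetical; only the `fad8027` line comes from the message.

```python
from typing import Optional


def first_matching_commit(oneline_log: str, subject: str) -> Optional[str]:
    """Return the abbreviated hash of the first entry whose title contains subject.

    Mirrors `grep -m 1`: scanning stops at the first hit.
    """
    for entry in oneline_log.splitlines():
        # Each `git log --oneline` entry is "<abbrev-hash> <title>".
        abbrev, _, title = entry.partition(" ")
        if subject in title:
            return abbrev
    return None


# Only the fad8027 line is taken from the message; the other two entries
# are made-up placeholders to give the function something to skip.
sample_log = "\n".join([
    "1111111 hypothetical unrelated commit",
    "fad8027 powerpc/watchdog: Use hrtimers for per-CPU heartbeat",
    "2222222 another hypothetical commit",
])

print(first_matching_commit(
    sample_log, "powerpc/watchdog: Use hrtimers for per-CPU heartbeat"))
```

Whether the commit is already part of a tagged release could then be checked with `git tag --contains fad8027`, which lists the tags that include it. An empty result would match the "not yet tagged" observation.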
[Kernel-packages] [Bug 1842465] Re: Watchdog error about hard lockup
** Also affects: ubuntu-power-systems
   Importance: Undecided
       Status: New

** Changed in: ubuntu-power-systems
     Assignee: (unassigned) => Canonical Kernel Team (canonical-kernel-team)

** Changed in: ubuntu-power-systems
   Importance: Undecided => High

Status in The Ubuntu-power-systems project:
  New
Status in linux package in Ubuntu:
  New