[Kernel-packages] [Bug 1842465] Re: Watchdog error about hard lockup

2019-10-07 Thread Frank Heimes
The patch has meanwhile landed in disco's release pocket, hence adjusting
the status to Fix Released.

** Changed in: linux (Ubuntu)
   Status: Fix Committed => Fix Released

** Changed in: ubuntu-power-systems
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1842465

Title:
  Watchdog error about hard lockup

Status in The Ubuntu-power-systems project:
  Fix Released
Status in linux package in Ubuntu:
  Fix Released

Bug description:
  ---Problem Description---
  Got a message from Watchdog about self-detected hard LOCKUP
   
  ---uname output---
  Linux power 5.0.0-23-generic #24~18.04.1-Ubuntu SMP Mon Jul 29 16:08:34 UTC 2019 ppc64le ppc64le ppc64le GNU/Linux
   
  ---Additional Hardware Info---
  Architecture:        ppc64le
  Byte Order:          Little Endian
  CPU(s):              128
  On-line CPU(s) list: 0-127
  Thread(s) per core:  4
  Core(s) per socket:  16
  Socket(s):           2
  NUMA node(s):        6
  Model:               2.2 (pvr 004e 1202)
  Model name:          POWER9, altivec supported
  CPU max MHz:         3800.
  CPU min MHz:         2300.
  L1d cache:           32K
  L1i cache:           32K
  L2 cache:            512K
  L3 cache:            10240K
  NUMA node0 CPU(s):   0-63
  NUMA node8 CPU(s):   64-127
  NUMA node252 CPU(s):
  NUMA node253 CPU(s):
  NUMA node254 CPU(s):
  NUMA node255 CPU(s):
  ---
  free
                total        used        free      shared  buff/cache   available
  Mem:     1071807104     5110016   985192768     6229440    81504320  1056273664
  Swap:       2097088           0     2097088
  --
  lsblk
  NAME    MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
  sda       8:0    1 894.3G  0 disk
  ├─sda1    8:1    1     7M  0 part
  └─sda2    8:2    1 894.3G  0 part /
  sdb       8:16   1 894.3G  0 disk
  nvme0n1 259:1    0   2.9T  0 disk /nvmdisk1
  ---
   
  Machine Type = AC922, bare metal 
   
  ---Steps to Reproduce---
  I encountered this problem while running a customer workload: I switched
  the SMT level from SMT2 to SMT1 and got a lockup error right away. This
  seems to be a different one... A PostgreSQL DB daemon was running on the
  system.
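The topology reported above (2 sockets x 16 cores x 4 threads = 128 logical CPUs) determines which logical CPUs remain online at each SMT level. The following is only a minimal sketch: it assumes that the threads of a core are numbered consecutively and that lowering the SMT level keeps the first N threads of every core online, which is how `ppc64_cpu --smt=N` from powerpc-utils is understood to behave. The function name is hypothetical and not part of the report.

```python
def online_cpus(total_cpus: int, threads_per_core: int, smt_level: int) -> list[int]:
    """Return the logical CPU ids expected to stay online at the given SMT level.

    Assumption (not from the bug report): threads of a core are numbered
    consecutively and the first `smt_level` threads of each core stay online.
    """
    if not 1 <= smt_level <= threads_per_core:
        raise ValueError("smt_level must be between 1 and threads_per_core")
    return [cpu for cpu in range(total_cpus) if cpu % threads_per_core < smt_level]


# Values from the lscpu output in the report: 128 CPUs, 4 threads per core.
smt2 = online_cpus(128, 4, 2)
smt1 = online_cpus(128, 4, 1)
print(len(smt2), len(smt1))               # 64 32
print(sorted(set(smt2) - set(smt1))[:4])  # [1, 5, 9, 13]
```

Under these assumptions, CPU 53 (thread 1 of its core) would be among the threads taken offline by the SMT2 to SMT1 switch. The report does not establish whether that is related to the lockup.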
   
  Stack trace output:
  [756383.688067] watchdog: CPU 53 self-detected hard LOCKUP @ _raw_spin_lock+0x54/0xe0
  [756383.688068] watchdog: CPU 53 TB:387344180861438, last heartbeat TB:387337108856720 (13812ms ago)
  [756383.688069] Modules linked in: binfmt_misc veth ipt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat_ipv4 xt_addrtype iptable_filter bpfilter xt_conntrack nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter bridge stp llc aufs overlay vmx_crypto ofpart cmdlinepart powernv_flash ipmi_powernv opal_prd mtd ipmi_devintf at24 ibmpowernv ipmi_msghandler uio_pdrv_genirq uio sch_fq_codel ib_iser rdma_cm iw_cm ib_cm iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear mlx5_ib ib_uverbs ib_core ast crct10dif_vpmsum i2c_algo_bit crc32c_vpmsum ttm mlx5_core drm_kms_helper syscopyarea nvme sysfillrect sysimgblt fb_sys_fops drm nvme_core ahci libahci tls mlxfw devlink tg3 drm_panel_orientation_quirks
  [756383.688088] CPU: 53 PID: 119744 Comm: postgres Not tainted 5.0.0-23-generic #24~18.04.1-Ubuntu
  [756383.688088] NIP:  c0e0fcc4 LR: c015fd90 CTR: c0600460
  [756383.688089] REGS: c07fffb3bd70 TRAP: 0900   Not tainted  (5.0.0-23-generic)
  [756383.688089] MSR:  90009033   CR: 28242824  XER: 
  [756383.688091] CFAR: c0e0fcec IRQMASK: 1
  [756383.688092] GPR00: c015fd90 c000206f2cdf7970 c185c700 c00020732ea49100
  [756383.688093] GPR04: c000206f2cdf7a38  c000206f2cdf7b00 0001
  [756383.688095] GPR08: 0003 807d 8035 fffd
  [756383.688096] GPR12: 2000 c07c5080 7cde07504dd8 0f495eee0d68
  [756383.688097] GPR16: 7fffc0eb2bd7 7fffc0eb2aa0 0f496c289088 7fffc0eb2a74
  [756383.688098] GPR20:  0001 0001
  [756383.688099] GPR24:  c000206f2cdf7a38 c1349100 20732d70
  [756383.688100] GPR28: c1891c70 c000206f36d8b400 c1895c78 c00020732ea49100
  [756383.688102] NIP [c0e0fcc4] _raw_spin_lock+0x54/0xe0
  [756383.688102] LR [c015fd90] __task_rq_lock+0x80/0x150
  [756383.688102] Call Trace:
  [756383.688103] [c000206f2cdf7970] [c000206f2cdf79d0] 0xc000206f2cdf79d0 (unreliable)
  [756383.688103] [c000206f2cdf79a0] [c07fd3847818] 0xc07fd3847818
  [756383.688104] [c000206f2cdf7a10] [c01649c0] try_to_wake_up+0x380/0x710
  [756383.688105] [c000206f2cdf7aa0] [c0164de0] wake_up_q+0x70/0xd0
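
The "13812ms ago" figure in the watchdog message can be reproduced from the two timebase (TB) values it prints. This is a minimal sketch that assumes the 512 MHz timebase frequency used on POWER9; the frequency is not stated in the report, and the names below are illustrative only.

```python
TB_HZ = 512_000_000  # assumed timebase frequency (512 MHz); not given in the report


def tb_delta_ms(tb_now: int, tb_last: int, tb_hz: int = TB_HZ) -> int:
    """Convert a timebase tick difference to whole milliseconds."""
    return (tb_now - tb_last) * 1000 // tb_hz


# TB values copied from the watchdog message for CPU 53.
print(tb_delta_ms(387344180861438, 387337108856720))  # 13812
```

The result matches the value in the log, so CPU 53 had gone roughly 13.8 seconds without a heartbeat when the watchdog fired.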
 

[Kernel-packages] [Bug 1842465] Re: Watchdog error about hard lockup

2019-09-16 Thread Andrew Cloke
Marked as "Fix Committed" as the patchset was picked up automatically by
the latest 5.0 stable sync.

Status in The Ubuntu-power-systems project:
  Fix Committed
Status in linux package in Ubuntu:
  Fix Committed

[Kernel-packages] [Bug 1842465] Re: Watchdog error about hard lockup

2019-09-12 Thread Manoj Iyer
** Changed in: linux (Ubuntu)
   Importance: Undecided => High

** Changed in: linux (Ubuntu)
     Assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) => Canonical Kernel Team (canonical-kernel-team)

Status in The Ubuntu-power-systems project:
  Fix Committed
Status in linux package in Ubuntu:
  Fix Committed

[Kernel-packages] [Bug 1842465] Re: Watchdog error about hard lockup

2019-09-12 Thread Frank Heimes
** Changed in: ubuntu-power-systems
   Status: Confirmed => Fix Committed

** Changed in: linux (Ubuntu)
   Status: Confirmed => Fix Committed

Status in The Ubuntu-power-systems project:
  Fix Committed
Status in linux package in Ubuntu:
  Fix Committed

[Kernel-packages] [Bug 1842465] Re: Watchdog error about hard lockup

2019-09-09 Thread Frank Heimes
** Changed in: ubuntu-power-systems
     Assignee: Canonical Kernel Team (canonical-kernel-team) => Frank Heimes (frank-heimes)

Status in The Ubuntu-power-systems project:
  Confirmed
Status in linux package in Ubuntu:
  Confirmed

[Kernel-packages] [Bug 1842465] Re: Watchdog error about hard lockup

2019-09-09 Thread Frank Heimes
The commit mentioned above is already in disco master-next:
~/ubuntu-disco-master-next/ubuntu-disco-clean$ git log --oneline | grep -m 1 "powerpc/watchdog: Use hrtimers for per-CPU heartbeat"
fad8027 powerpc/watchdog: Use hrtimers for per-CPU heartbeat
but not yet tagged.
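
The check above can also be scripted. This is a minimal sketch that searches `git log --oneline` output for a commit subject; the helper name is hypothetical, and the sample line is the one quoted in this message.

```python
from typing import Optional, Tuple


def find_commit(oneline_log: str, subject: str) -> Optional[Tuple[str, str]]:
    """Return (abbreviated hash, subject) of the first line whose subject contains `subject`."""
    for line in oneline_log.splitlines():
        commit, _, title = line.strip().partition(" ")
        if commit and subject in title:
            return commit, title
    return None


sample = "fad8027 powerpc/watchdog: Use hrtimers for per-CPU heartbeat\n"
print(find_commit(sample, "powerpc/watchdog: Use hrtimers for per-CPU heartbeat"))
```

In a real checkout the log text could come from `subprocess.run(["git", "log", "--oneline"], capture_output=True, text=True).stdout`. Running `git tag --contains fad8027` would then list the tags that already include the commit, and empty output would correspond to "not yet tagged".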

** Changed in: linux (Ubuntu)
   Status: New => Confirmed

** Changed in: ubuntu-power-systems
   Status: New => Confirmed

Status in The Ubuntu-power-systems project:
  Confirmed
Status in linux package in Ubuntu:
  Confirmed

[Kernel-packages] [Bug 1842465] Re: Watchdog error about hard lockup

2019-09-03 Thread Andrew Cloke
** Also affects: ubuntu-power-systems
   Importance: Undecided
   Status: New

** Changed in: ubuntu-power-systems
 Assignee: (unassigned) => Canonical Kernel Team (canonical-kernel-team)

** Changed in: ubuntu-power-systems
   Importance: Undecided => High

Status in The Ubuntu-power-systems project:
  New
Status in linux package in Ubuntu:
  New