[Kernel-packages] [Bug 1757402] Re: Ubuntu18.04:pKVM - Host in hung state and out of network after few hours of stress run on all guests
** Changed in: linux (Ubuntu Artful)
       Status: Incomplete => Won't Fix

** Changed in: ubuntu-power-systems
       Status: Incomplete => Fix Released

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1757402

Title:
  Ubuntu18.04:pKVM - Host in hung state and out of network after few
  hours of stress run on all guests

Status in The Ubuntu-power-systems project:
  Fix Released
Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Artful:
  Won't Fix
Status in linux source package in Bionic:
  Fix Released

Bug description:
  == Comment: #0 - INDIRA P. JOGA - 2018-02-11 12:37:25 ==

  Problem Description:
  ===
  After a few hours of running, the system is in a hung state, with
  "rcu_sched detected stalls on CPUs/tasks" messages on the host IPMI
  console, and the host is out of network.

  Steps to re-create:
  ==
  > Installed Ubuntu1804 on boslcp3 host.

  root@boslcp3:~# uname -a
  Linux boslcp3 4.13.0-25-generic #29-Ubuntu SMP Mon Jan 8 21:15:55 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux
  root@boslcp3:~# uname -r
  4.13.0-25-generic

  > root@boslcp3:~# ppc64_cpu --smt
  SMT is off

  > Hugepage set up

  echo 8500 > /proc/sys/vm/nr_hugepages

  > Defined the guests from host machine

   Id  Name       State
   2   boslcp3g2  shut off
   3   boslcp3g3  shut off
   4   boslcp3g4  shut off
   6   boslcp3g1  shut off
   7   boslcp3g5  shut off

  > Started and installed ubuntu1804 daily build on all the guests.

  root@boslcp3:~# virsh list --all
   Id  Name       State
   2   boslcp3g2  running
   3   boslcp3g3  running
   4   boslcp3g4  running
   6   boslcp3g1  running
   7   boslcp3g5  running

  > Started regression run (IO_BASE_TCP_NFS) tests on all 5 guests.

  NOTE: Removed madvise test case from BASE focus areas.

  > Run went fine for a few hours on all guests.

  > After a few hours of running, the host system is in a hung state and
  the host console dumps CPU stall messages as below

  [SOL Session operational. Use ~? for help]
  [250867.133429] INFO: rcu_sched detected stalls on CPUs/tasks:
  [250867.133499] (detected by 86, t=62711832 jiffies, g=497, c=496, q=31987857)
  [250867.133554] All QSes seen, last rcu_sched kthread activity 62711828 (4357609080-4294897252), jiffies_till_next_fqs=1, root ->qsmask 0x0
  [250867.133690] rcu_sched kthread starved for 62711828 jiffies! g497 c496 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x100
  [250931.133433] INFO: rcu_sched detected stalls on CPUs/tasks:
  [250931.133494] (detected by 3, t=62727832 jiffies, g=497, c=496, q=31995625)
  [250931.133572] All QSes seen, last rcu_sched kthread activity 62727828 (4357625080-4294897252), jiffies_till_next_fqs=1, root ->qsmask 0x0
  [250931.133741] rcu_sched kthread starved for 62727828 jiffies! g497 c496 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x100
  [250995.133432] INFO: rcu_sched detected stalls on CPUs/tasks:
  [250995.133480] (detected by 54, t=62743832 jiffies, g=497, c=496, q=32004479)
  [250995.133526] All QSes seen, last rcu_sched kthread activity 62743828 (4357641080-4294897252), jiffies_till_next_fqs=1, root ->qsmask 0x0
  [250995.133645] rcu_sched kthread starved for 62743828 jiffies! g497 c496 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x100

  > Not able to get the prompt

  > Ping/ssh to boslcp3 also fails

  [ipjoga@kte ~]$ ping boslcp3
  PING boslcp3.isst.aus.stglabs.ibm.com (10.33.0.157) 56(84) bytes of data.
  From kte.isst.aus.stglabs.ibm.com (10.33.11.31) icmp_seq=1 Destination Host Unreachable
  From kte.isst.aus.stglabs.ibm.com (10.33.11.31) icmp_seq=2 Destination Host Unreachable
  From kte.isst.aus.stglabs.ibm.com (10.33.11.31) icmp_seq=3 Destination Host Unreachable

  [ipjoga@kte ~]$ ssh root@boslcp3
  ssh: connect to host boslcp3 port 22: No route to host

  > boslcp3 is not reachable

  > Attached boslcp3 host console logs

  == Comment: #1 - INDIRA P. JOGA - 2018-02-11 12:39:29 ==
  Added Host console logs

  == Comment: #24 - VIPIN K. PARASHAR - 2018-02-16 05:46:13 ==
  From Linux logs
  ===
  [72072.290071] watchdog: BUG: soft lockup - CPU#132 stuck for 22s! [CPU 12/KVM:15579]
  [72072.290218] CPU: 132 PID: 15579 Comm: CPU 12/KVM Tainted: GWL 4.13.0-32-generic #35-Ubuntu
  [72072.290220] task: c000200debf82e00 task.stack: c000200e140f8000
  [72072.290221] NIP: c0c779e0 LR: c008166893a0 CTR: c0c77980
  [72072.290223] REGS: c000200e140fb790 TRAP: 0901 Tainted: GWL (4.13.0-32-generic)
  [72072.290224]
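The rcu_sched stall reports quoted in the bug description can be cross-checked with a short script. The following is only a minimal sketch, not part of the original report: the function names `parse_stalls` and `estimate_hz` are made up for illustration, and reading `t=` as the number of jiffies since the stalled grace period began is an assumption about how the kernel reports stalls. The tick rate is not taken from any kernel config; it is estimated from the log itself by comparing the growth of `t=` with the growth of the kernel timestamps.

```python
import re

# Stall report lines copied from the bug description (comment #0).
LOG = """\
[250867.133429] INFO: rcu_sched detected stalls on CPUs/tasks:
[250867.133499] (detected by 86, t=62711832 jiffies, g=497, c=496, q=31987857)
[250931.133433] INFO: rcu_sched detected stalls on CPUs/tasks:
[250931.133494] (detected by 3, t=62727832 jiffies, g=497, c=496, q=31995625)
[250995.133432] INFO: rcu_sched detected stalls on CPUs/tasks:
[250995.133480] (detected by 54, t=62743832 jiffies, g=497, c=496, q=32004479)
"""

# Matches "[<timestamp>] (detected by <cpu>, t=<jiffies> jiffies".
STALL_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+\(detected by (\d+), t=(\d+) jiffies")


def parse_stalls(text):
    """Return (timestamp_seconds, detecting_cpu, stall_jiffies) per report."""
    return [(float(ts), int(cpu), int(t)) for ts, cpu, t in STALL_RE.findall(text)]


def estimate_hz(stalls):
    """Estimate jiffies per second from the first and last stall reports."""
    (ts_first, _, t_first), (ts_last, _, t_last) = stalls[0], stalls[-1]
    return (t_last - t_first) / (ts_last - ts_first)


stalls = parse_stalls(LOG)
hz = estimate_hz(stalls)

# Assumption: t= counts jiffies since the stalled grace period started.
stall_seconds = stalls[0][2] / round(hz)

print(f"reports parsed:     {len(stalls)}")
print(f"estimated tick rate: {hz:.0f} jiffies/s")
print(f"first report: stalled for about {stall_seconds / 3600:.1f} hours "
      f"at kernel timestamp {stalls[0][0]:.0f} s")
```

Under that reading, the stall interval in the first report is nearly as long as the kernel timestamp on the same line, and the three reports arrive about 64 seconds apart.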
[Kernel-packages] [Bug 1757402] Re: Ubuntu18.04:pKVM - Host in hung state and out of network after few hours of stress run on all guests
** Changed in: linux (Ubuntu Bionic)
       Status: Fix Committed => Fix Released

** Changed in: ubuntu-power-systems
       Status: Triaged => Incomplete

Status in The Ubuntu-power-systems project:
  Incomplete
Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Artful:
  Incomplete
Status in linux source package in Bionic:
  Fix Released
[Kernel-packages] [Bug 1757402] Re: Ubuntu18.04:pKVM - Host in hung state and out of network after few hours of stress run on all guests
** Changed in: linux (Ubuntu Artful)
       Status: New => Incomplete

Status in The Ubuntu-power-systems project:
  Triaged
Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Artful:
  Incomplete
Status in linux source package in Bionic:
  Fix Committed
[Kernel-packages] [Bug 1757402] Re: Ubuntu18.04:pKVM - Host in hung state and out of network after few hours of stress run on all guests
** Changed in: linux (Ubuntu Artful)
     Assignee: (unassigned) => Canonical Kernel Team (canonical-kernel-team)

** Changed in: linux (Ubuntu Artful)
   Importance: Undecided => High

Status in The Ubuntu-power-systems project:
  Triaged
Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Artful:
  New
Status in linux source package in Bionic:
  Fix Committed
[Kernel-packages] [Bug 1757402] Re: Ubuntu18.04:pKVM - Host in hung state and out of network after few hours of stress run on all guests
This has been committed to Bionic, as part of another bug. However, it
looks like this would affect Artful as well. Would you be able to
confirm and test a patched kernel for Artful?

Thanks.
Cascardo.

Status in The Ubuntu-power-systems project:
  Triaged
Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Artful:
  New
Status in linux source package in Bionic:
  Fix Committed
[Kernel-packages] [Bug 1757402] Re: Ubuntu18.04:pKVM - Host in hung state and out of network after few hours of stress run on all guests
** Description changed:

  > Defined the guests from host machine
   Id Name State
- 2 boslcp3g2 shut off
- 3 boslcp3g3 shut off
- 4 boslcp3g4 shut off
- 6 boslcp3g1 shut off
- 7 boslcp3g5 shut off
- 
+ 2 boslcp3g2 shut off
+ 3 boslcp3g3 shut off
+ 4 boslcp3g4 shut off
+ 6 boslcp3g1 shut off
+ 7 boslcp3g5 shut off

  > Started and installed ubuntu1804 daily build on all the guests.
  root@boslcp3:~# virsh list --all
- Id Name State
+ Id Name State
- 2 boslcp3g2 running
- 3 boslcp3g3 running
- 4 boslcp3g4 running
- 6 boslcp3g1 running
- 7 boslcp3g5 running
+ 2 boslcp3g2 running
+ 3 boslcp3g3 running
+ 4 boslcp3g4 running
+ 6 boslcp3g1 running
+ 7 boslcp3g5 running
[Kernel-packages] [Bug 1757402] Re: Ubuntu18.04:pKVM - Host in hung state and out of network after few hours of stress run on all guests
** Changed in: ubuntu-power-systems
       Status: New => Triaged

Status in The Ubuntu-power-systems project:
  Triaged
Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Bionic:
  Triaged
[Kernel-packages] [Bug 1757402] Re: Ubuntu18.04:pKVM - Host in hung state and out of network after few hours of stress run on all guests
** Changed in: linux (Ubuntu Bionic)
     Assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) => Canonical Kernel Team (canonical-kernel-team)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1757402

Title:
  Ubuntu18.04:pKVM - Host in hung state and out of network after few hours of stress run on all guests

Status in The Ubuntu-power-systems project:
  Triaged
Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Bionic:
  Triaged

Bug description:
  == Comment: #0 - INDIRA P. JOGA - 2018-02-11 12:37:25 ==

  Problem Description:
  ===
  After a few hours of the run, the system is in a hung state with
  "rcu_sched detected stalls on CPUs/tasks" messages on the host IPMI
  console, and the host is off the network.

  Steps to re-create:
  ==
  > Installed Ubuntu1804 on boslcp3 host.

  root@boslcp3:~# uname -a
  Linux boslcp3 4.13.0-25-generic #29-Ubuntu SMP Mon Jan 8 21:15:55 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux
  root@boslcp3:~# uname -r
  4.13.0-25-generic

  > root@boslcp3:~# ppc64_cpu --smt
  SMT is off

  > Hugepage set up
  echo 8500 > /proc/sys/vm/nr_hugepages

  > Defined the guests from host machine

   Id    Name         State
   2     boslcp3g2    shut off
   3     boslcp3g3    shut off
   4     boslcp3g4    shut off
   6     boslcp3g1    shut off
   7     boslcp3g5    shut off

  > Started and installed ubuntu1804 daily build on all the guests.

  root@boslcp3:~# virsh list --all
   Id    Name         State
   2     boslcp3g2    running
   3     boslcp3g3    running
   4     boslcp3g4    running
   6     boslcp3g1    running
   7     boslcp3g5    running

  > Started regression run (IO_BASE_TCP_NFS) tests on all 5 guests.
  NOTE: Removed madvise test case from BASE focus areas.

  > Run went fine for a few hours on all guests.

  > After a few hours of the run, the host system is in a hung state and
  the host console dumps CPU stall messages as below

  [SOL Session operational. Use ~? for help]
  [250867.133429] INFO: rcu_sched detected stalls on CPUs/tasks:
  [250867.133499] (detected by 86, t=62711832 jiffies, g=497, c=496, q=31987857)
  [250867.133554] All QSes seen, last rcu_sched kthread activity 62711828 (4357609080-4294897252), jiffies_till_next_fqs=1, root ->qsmask 0x0
  [250867.133690] rcu_sched kthread starved for 62711828 jiffies! g497 c496 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x100
  [250931.133433] INFO: rcu_sched detected stalls on CPUs/tasks:
  [250931.133494] (detected by 3, t=62727832 jiffies, g=497, c=496, q=31995625)
  [250931.133572] All QSes seen, last rcu_sched kthread activity 62727828 (4357625080-4294897252), jiffies_till_next_fqs=1, root ->qsmask 0x0
  [250931.133741] rcu_sched kthread starved for 62727828 jiffies! g497 c496 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x100
  [250995.133432] INFO: rcu_sched detected stalls on CPUs/tasks:
  [250995.133480] (detected by 54, t=62743832 jiffies, g=497, c=496, q=32004479)
  [250995.133526] All QSes seen, last rcu_sched kthread activity 62743828 (4357641080-4294897252), jiffies_till_next_fqs=1, root ->qsmask 0x0
  [250995.133645] rcu_sched kthread starved for 62743828 jiffies! g497 c496 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x100

  > Not able to get the prompt

  > Ping / ssh to boslcp3 also fails

  [ipjoga@kte ~]$ ping boslcp3
  PING boslcp3.isst.aus.stglabs.ibm.com (10.33.0.157) 56(84) bytes of data.
  From kte.isst.aus.stglabs.ibm.com (10.33.11.31) icmp_seq=1 Destination Host Unreachable
  From kte.isst.aus.stglabs.ibm.com (10.33.11.31) icmp_seq=2 Destination Host Unreachable
  From kte.isst.aus.stglabs.ibm.com (10.33.11.31) icmp_seq=3 Destination Host Unreachable

  [ipjoga@kte ~]$ ssh root@boslcp3
  ssh: connect to host boslcp3 port 22: No route to host

  > boslcp3 is not reachable

  > Attached boslcp3 host console logs

  == Comment: #1 - INDIRA P. JOGA - 2018-02-11 12:39:29 ==
  Added Host console logs

  == Comment: #24 - VIPIN K. PARASHAR - 2018-02-16 05:46:13 ==
  From Linux logs
  ===
  [72072.290071] watchdog: BUG: soft lockup - CPU#132 stuck for 22s! [CPU 12/KVM:15579]
  [72072.290218] CPU: 132 PID: 15579 Comm: CPU 12/KVM Tainted: GWL 4.13.0-32-generic #35-Ubuntu
  [72072.290220] task: c000200debf82e00 task.stack: c000200e140f8000
  [72072.290221] NIP: c0c779e0 LR: c008166893a0 CTR: c0c77980
  [72072.290223] REGS: c000200e140fb790 TRAP: 0901 Tainted: GWL (4.13.0-32-generic)
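
For anyone reading the rcu_sched stall lines in the console log above, the counters can be pulled apart with a short script. What follows is only a sketch, not part of the original report: the helper name `parse_stall` and the dictionary keys are made up for illustration, and `HZ = 250` is an assumption (the report does not state CONFIG_HZ; 250 is the value under which the jiffies counters agree with the printk timestamps). The field meanings in the comments follow the kernel's RCU stall-warning output as far as I understand it.

```python
import re

# Assumption: CONFIG_HZ=250. The bug report does not state it.
HZ = 250

# "(detected by <cpu>, t=<jiffies since grace-period start>, g=.., c=.., q=..)"
STALL_RE = re.compile(
    r"\(detected by (\d+), t=(\d+) jiffies, g=(\d+), c=(\d+), q=(\d+)\)"
)
# "last rcu_sched kthread activity <idle> (<jiffies now>-<jiffies at last activity>)"
ACTIVITY_RE = re.compile(
    r"last rcu_sched kthread activity (\d+) \((\d+)-(\d+)\)"
)


def parse_stall(text):
    """Extract the counters from one rcu_sched stall report (hypothetical helper)."""
    stall = STALL_RE.search(text)
    activity = ACTIVITY_RE.search(text)
    if stall is None or activity is None:
        raise ValueError("not an rcu_sched stall report")
    cpu, t, g, c, q = map(int, stall.groups())
    idle, now, last = map(int, activity.groups())
    return {
        "detected_by": cpu,
        "stall_jiffies": t,
        "stall_seconds": t / HZ,      # only meaningful if HZ is right
        "gp_started": g,              # g= field
        "gp_completed": c,            # c= field
        "callbacks_queued": q,        # q= field
        "kthread_idle_jiffies": idle,
        "jiffies_now": now,
        "jiffies_last_activity": last,
    }


# First stall report from the host console log quoted in this bug.
SAMPLE = (
    "[250867.133429] INFO: rcu_sched detected stalls on CPUs/tasks: "
    "[250867.133499] (detected by 86, t=62711832 jiffies, g=497, c=496, q=31987857) "
    "[250867.133554] All QSes seen, last rcu_sched kthread activity 62711828 "
    "(4357609080-4294897252), jiffies_till_next_fqs=1, root ->qsmask 0x0"
)

report = parse_stall(SAMPLE)
print("detected by CPU", report["detected_by"])
print("grace period stalled for about %.0f s" % report["stall_seconds"])
print("kthread idle (jiffies):", report["kthread_idle_jiffies"])
```

If the HZ assumption holds, the t= value corresponds to roughly 250847 seconds, which is close to the printk timestamp of 250867 seconds on the same line. That would suggest grace period 497 had been pending since shortly after boot, but this is an inference from the numbers, not something the report states.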
[Kernel-packages] [Bug 1757402] Re: Ubuntu18.04:pKVM - Host in hung state and out of network after few hours of stress run on all guests
** Changed in: linux (Ubuntu)
       Status: New => Triaged

** Changed in: linux (Ubuntu)
   Importance: Undecided => Critical

** Also affects: linux (Ubuntu Bionic)
   Importance: Critical
     Assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
       Status: Triaged

Status in The Ubuntu-power-systems project:
  New
Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Bionic:
  Triaged
[Kernel-packages] [Bug 1757402] Re: Ubuntu18.04:pKVM - Host in hung state and out of network after few hours of stress run on all guests
** Also affects: ubuntu-power-systems
   Importance: Undecided
       Status: New

** Changed in: ubuntu-power-systems
   Importance: Undecided => Critical

** Changed in: ubuntu-power-systems
     Assignee: (unassigned) => Canonical Kernel Team (canonical-kernel-team)

** Tags added: triage-g

Status in The Ubuntu-power-systems project:
  New
Status in linux package in Ubuntu:
  New