Public bug reported:

I have a Dell PowerEdge T110 (0V52N7, BIOS 1.6.4 03/02/2011) that ran
Ubuntu 16.04 reliably for a while. Ever since I upgraded to 16.10, I have
been getting soft-lockup errors, OOM kills, and an eventual kernel panic.
The machine runs fine for about 3-4 hours or so. I see the following
errors in syslog (also attached, along with the other logs and information
I could gather).

Jan  9 07:36:32 gorilla kernel: [69304.099302] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kswapd0:50]
Jan  9 07:37:00 gorilla kernel: [69332.119587] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kswapd0:50]
Jan  9 07:37:33 gorilla kernel: [69364.114705] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [kswapd0:50]
Jan  9 07:38:01 gorilla kernel: [69392.127352] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [kswapd0:50]
Jan  9 07:38:37 gorilla kernel: [69428.134132] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [kswapd0:50]
Jan  9 07:39:45 gorilla kernel: [69496.112694] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [kswapd0:50]
Jan  9 07:40:13 gorilla kernel: [69524.112050] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kswapd0:50]
Jan  9 07:40:49 gorilla kernel: [69560.104511] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kswapd0:50]
Jan  9 07:41:17 gorilla kernel: [69588.107302] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kswapd0:50]
Jan  9 07:41:45 gorilla kernel: [69616.104843] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [kswapd0:50]

Jan  8 11:52:27 gorilla kernel: [ 2852.818471] rsync invoked oom-killer: gfp_mask=0x26000d0(GFP_TEMPORARY|__GFP_NOTRACK), order=0, oom_score_adj=0
Jan  9 07:38:56 gorilla kernel: [69448.096571] kthreadd invoked oom-killer: gfp_mask=0x27000c0(GFP_KERNEL_ACCOUNT|__GFP_NOTRACK), order=1, oom_score_adj=0
Jan  9 07:39:46 gorilla kernel: [69497.705922] apache2 invoked oom-killer: gfp_mask=0x27000c0(GFP_KERNEL_ACCOUNT|__GFP_NOTRACK), order=1, oom_score_adj=0
Jan  9 07:40:50 gorilla kernel: [69561.956773] sh invoked oom-killer: gfp_mask=0x27000c0(GFP_KERNEL_ACCOUNT|__GFP_NOTRACK), order=1, oom_score_adj=0
Jan  9 07:41:10 gorilla kernel: [69582.329364] rsync invoked oom-killer: gfp_mask=0x26000d0(GFP_TEMPORARY|__GFP_NOTRACK), order=0, oom_score_adj=0
Jan  9 07:42:40 gorilla kernel: [69672.181041] sessionclean invoked oom-killer: gfp_mask=0x27000c0(GFP_KERNEL_ACCOUNT|__GFP_NOTRACK), order=1, oom_score_adj=0
Jan  9 07:42:41 gorilla kernel: [69673.298714] apache2 invoked oom-killer: gfp_mask=0x27000c0(GFP_KERNEL_ACCOUNT|__GFP_NOTRACK), order=1, oom_score_adj=0
Jan  9 07:42:59 gorilla kernel: [69691.320169] apache2 invoked oom-killer: gfp_mask=0x27000c0(GFP_KERNEL_ACCOUNT|__GFP_NOTRACK), order=1, oom_score_adj=0
Jan  9 07:43:03 gorilla kernel: [69694.769140] sessionclean invoked oom-killer: gfp_mask=0x27000c0(GFP_KERNEL_ACCOUNT|__GFP_NOTRACK), order=1, oom_score_adj=0
Jan  9 07:43:20 gorilla kernel: [69712.255535] kthreadd invoked oom-killer: gfp_mask=0x27000c0(GFP_KERNEL_ACCOUNT|__GFP_NOTRACK), order=1, oom_score_adj=0
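Most of these OOM invocations are order=1 GFP_KERNEL_ACCOUNT allocations,
which as far as I understand points at kernel-side memory (e.g. thread
stacks) rather than ordinary user pages, so I started snapshotting the
kernel memory counters to see whether something grows before the lockups.
A rough sketch; the interval, the choice of counters, and the log path are
my own:

```shell
# Append one timestamped snapshot of kernel memory counters to a file.
snapshot_mem() {
  {
    date +%s
    grep -E '^(MemFree|Slab|SUnreclaim|KernelStack):' /proc/meminfo
  } >> "$1"
}
# Example: run from cron every minute: snapshot_mem /var/log/meminfo.log
```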


Jan  8 11:46:11 gorilla kernel: [ 2476.342532] perf: interrupt took too long (2512 > 2500), lowering kernel.perf_event_max_sample_rate to 79500
Jan  8 11:49:04 gorilla kernel: [ 2650.045417] perf: interrupt took too long (3147 > 3140), lowering kernel.perf_event_max_sample_rate to 63500
Jan  8 11:49:56 gorilla kernel: [ 2701.973751] perf: interrupt took too long (3982 > 3933), lowering kernel.perf_event_max_sample_rate to 50000
Jan  8 11:51:47 gorilla kernel: [ 2812.208307] perf: interrupt took too long (4980 > 4977), lowering kernel.perf_event_max_sample_rate to 40000
Jan  8 13:56:06 gorilla kernel: [ 5678.539070] perf: interrupt took too long (2513 > 2500), lowering kernel.perf_event_max_sample_rate to 79500
Jan  8 15:59:49 gorilla kernel: [13101.158417] perf: interrupt took too long (3148 > 3141), lowering kernel.perf_event_max_sample_rate to 63500
Jan  9 02:15:54 gorilla kernel: [50065.939132] perf: interrupt took too long (3942 > 3935), lowering kernel.perf_event_max_sample_rate to 50500
Jan  9 07:35:30 gorilla kernel: [69241.742219] perf: interrupt took too long (4932 > 4927), lowering kernel.perf_event_max_sample_rate to 40500
Jan  9 07:35:54 gorilla kernel: [69265.928531] perf: interrupt took too long (6170 > 6165), lowering kernel.perf_event_max_sample_rate to 32250
Jan  9 07:36:53 gorilla kernel: [69325.386696] perf: interrupt took too long (7723 > 7712), lowering kernel.perf_event_max_sample_rate to 25750
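As far as I can tell these perf messages are just the kernel throttling its
own sampling rate, so they are probably a symptom of the stalls rather than
a cause. To track how far the rate has been driven down, the last value can
be pulled out of the log; a small sketch, assuming the exact message format
above:

```shell
# Print the most recently logged perf_event_max_sample_rate value.
latest_rate() {
  grep -o 'perf_event_max_sample_rate to [0-9]*' "$1" \
    | tail -1 | grep -o '[0-9]*$'
}
# Example: latest_rate /var/log/syslog
```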


To make sure this is not a hardware memory problem, I ran memtest for 12
passes overnight and found no errors. I also removed the external backup
drives to isolate the problem, and checked similar issues on launchpad.net,
but most of them are related to video drivers or power supplies.

Any help is appreciated.

Thanks
-Arul

Attachments:
------------

syslog
uname.txt
swaps.txt
dmesg.txt
df.txt
lspci.txt
lsusb.txt
meminfo.txt
cpuinfo.txt

** Affects: ubuntu
     Importance: Undecided
         Status: New

** Attachment added: "syslog"
   https://bugs.launchpad.net/bugs/1655356/+attachment/4802356/+files/syslog

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1655356

Title:
  NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kswapd0:50];
  oom-killer; and eventual kernel panic on 16.10 (upgrade from 16.04)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+bug/1655356/+subscriptions

-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs