I started experiencing this very issue about 3 months ago.  My server, a
Dell XPS 8700, would lock up repeatedly over the course of a day but
stay "active" (my email server kept running), sometimes for a week,
even though I could not log in.  The server has the latest standard
updates for 14.04 LTS, so whatever kernel version that is.  Following
recommendations I found online, I replaced the PSU, to no avail.  The
thing that appears to have solved the issue was removing the video
card, which leaves the nouveau driver with nothing to drive.  In the
past four days I have not seen a single NMI Watchdog Soft Lockup error
in my syslog.  Granted, I now have to run my video through the built-in
graphics, but I am fine with that.
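
For anyone who wants to check their own syslog the same way, here is a
rough sketch of how the soft lockup entries could be counted per
process.  It assumes the message format shown in the log quoted in the
bug description below and the usual Ubuntu log path /var/log/syslog;
the function names are made up for illustration.

```python
import re
from collections import Counter

# Matches the kernel message format quoted in the bug description, e.g.
# "NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [kerneloops:814]"
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^\]]+):(?P<pid>\d+)\]"
)


def parse_lockup(line):
    """Return (cpu, seconds, process name, pid) or None if no match."""
    m = LOCKUP_RE.search(line)
    if m is None:
        return None
    return int(m["cpu"]), int(m["secs"]), m["comm"], int(m["pid"])


def count_lockups(lines):
    """Count soft lockup messages per process name."""
    counts = Counter()
    for line in lines:
        hit = parse_lockup(line)
        if hit is not None:
            counts[hit[2]] += 1
    return counts


def report(path="/var/log/syslog"):
    """Print one line per process, most frequent first.

    The default path is an assumption; adjust it for your system.
    """
    with open(path, errors="replace") as fh:
        for comm, n in count_lockups(fh).most_common():
            print(f"{n:6d}  {comm}")
```

Calling report() after a few days of uptime gives a quick before/after
comparison when a suspect piece of hardware or a driver is removed.
Reading /var/log/syslog may require root or membership in the adm
group.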

I know there is a lot of logic in an OS kernel and I am just one case,
but perhaps this will provide some insight when troubleshooting this
issue.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1530405

Title:
  NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [kerneloops:814]

Status in linux package in Ubuntu:
  Triaged

Bug description:
  I'm using Ubuntu Xenial 16.04 and my computer (ASUS M32BF) will
  randomly freeze up, sometimes before the login screen, sometimes while
  I'm in the middle of using a program. This sometimes happens on the
  Wily 15.10 live cd as well, and on both kernel 4.3.0-2, and kernel
  4.2.0-22.

  Important part of log:

  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112129] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [kerneloops:814]
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112134] Modules linked in: rfcomm bnep nls_iso8859_1 kvm_amd kvm eeepc_wmi asus_wmi crct10dif_pclmul sparse_keymap crc32_pclmul aesni_intel aes_x86_64 arc4 lrw gf128mul rtl8821ae glue_helper snd_hda_codec_realtek ablk_helper snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel btcoexist snd_hda_codec rtl_pci snd_hda_core joydev input_leds snd_hwdep rtlwifi snd_pcm fam15h_power cryptd snd_seq_midi serio_raw snd_seq_midi_event snd_rawmidi mac80211 snd_seq snd_seq_device snd_timer cfg80211 btusb btrtl btbcm btintel bluetooth snd soundcore edac_mce_amd k10temp edac_core i2c_piix4 shpchp mac_hid parport_pc ppdev lp parport autofs4 hid_logitech_hidpp uas usb_storage hid_logitech_dj usbhid hid amdkfd amd_iommu_v2 radeon i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops r8169 psmouse mii drm ahci libahci wmi fjes video
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112183] CPU: 0 PID: 814 Comm: kerneloops Not tainted 4.3.0-2-generic #11-Ubuntu
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112184] Hardware name: ASUSTeK COMPUTER INC. K30BF_M32BF_A_F_K31BF_6/K30BF_M32BF_A_F_K31BF_6, BIOS 0501 07/09/2015
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112186] task: ffff88031146d400 ti: ffff88030e838000 task.ti: ffff88030e838000
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112188] RIP: 0010:[<ffffffff810819d6>]  [<ffffffff810819d6>] __do_softirq+0x76/0x250
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112194] RSP: 0018:ffff88031fc03f30  EFLAGS: 00000202
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112196] RAX: ffff88030e83c000 RBX: 0000000000000000 RCX: 0000000040400040
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112197] RDX: 0000000000000000 RSI: 000000000000613e RDI: 0000000000000380
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112198] RBP: ffff88031fc03f80 R08: 00000029f8fa1411 R09: ffff88031fc169f0
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112199] R10: 0000000000000020 R11: 0000000000000004 R12: ffff88031fc169c0
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112201] R13: ffff88030df8e200 R14: 0000000000000000 R15: 0000000000000001
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112202] FS:  00007f6c266ac880(0000) GS:ffff88031fc00000(0000) knlGS:0000000000000000
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112203] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112205] CR2: 00007ffef000bff8 CR3: 000000030f368000 CR4: 00000000000406f0
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112206] Stack:
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112207]  404000401fc03f78 ffff88030e83c000 00000000ffff0121 ffff88030000000a
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112209]  000000021fc0d640 0000000000000000 ffff88031fc169c0 ffff88030df8e200
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112211]  0000000000000000 0000000000000001 ffff88031fc03f90 ffffffff81081d23
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112213] Call Trace:
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112215]  <IRQ>
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112219]  [<ffffffff81081d23>] irq_exit+0xa3/0xb0
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112222]  [<ffffffff817fda02>] smp_apic_timer_interrupt+0x42/0x50
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112225]  [<ffffffff817fb862>] apic_timer_interrupt+0x82/0x90
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112226]  <EOI>
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112230]  [<ffffffff810a5457>] ? finish_task_switch+0x67/0x1c0
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112232]  [<ffffffff817f645c>] __schedule+0x36c/0x980
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112234]  [<ffffffff817f6aa3>] schedule+0x33/0x80
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112236]  [<ffffffff817f9e4f>] do_nanosleep+0x6f/0xf0
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112239]  [<ffffffff810ea59c>] hrtimer_nanosleep+0xdc/0x1f0
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112241]  [<ffffffff810e9500>] ? __hrtimer_init+0x90/0x90
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112243]  [<ffffffff817f9e3a>] ? do_nanosleep+0x5a/0xf0
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112245]  [<ffffffff810ea72a>] SyS_nanosleep+0x7a/0x90
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112247]  [<ffffffff817faaf2>] entry_SYSCALL_64_fastpath+0x16/0x71
  Dec 31 19:13:12 COMPUTERNAME kernel: [   64.112248] Code: 45 d4 89 4d b4 65 48 8b 04 25 c4 3e 01 00 c7 45 c8 0a 00 00 00 48 89 45 b8 65 c7 05 31 24 f9 7e 00 00 00 00 fb 66 0f 1f 44 00 00 <b8> ff ff ff ff 49 c7 c4 c0 b0 e0 81 0f bc 45 d4 83 c0 01 89 45

  This soft lockup happens either in kerneloops or swapper/0.

  This might have something to do with networking, because the soft
  lockups appear to happen immediately before or after some
  network-related activity. nm-applet says NetworkManager is not
  running, but "service network-manager status" says it is. "service
  network-manager stop" does not work, and I need to "kill -9" the pids
  of the processes. After I kill the service and start it again, the
  nm-applet's "Edit Connection" works for a few moments, then it won't
  delete any connections, and when closed, it won't re-open. In the end,
  "kill -9" won't even work anymore (the processes get reparented to
  init, but do not die). Usually, however, I never even see the login
  screen.

  I've been able to boot into the Wily livecd by using Windows 10 ->
  Shift-Restart -> UEFI Settings -> then booting my USB with the livecd
  (booting it without going through Windows often ends in soft lockups).
