Public bug reported:

Release: Ubuntu 14.04.5 LTS
Kernel: Linux 4.4.0-67-generic #88~14.04.1-Ubuntu SMP
Filesystems: ext4 on Hardware RAID 6

We regularly run a backup script that mainly uses rsync and mv. When
there is a lot of change, the server sometimes freezes and can only be
recovered by power cycling. I first thought it was a hardware problem,
but we now have this problem on 2 out of 18 identical machines. They
have different BIOS versions. So it is probably related to the amount of
data. During the backup I see high load from the rsync and chmod
processes.

Kernel messages:
Apr  2 01:09:58 server kernel: [483707.688686] NMI watchdog: BUG: soft lockup - 
CPU#7 stuck for 22s! [kswapd0:83]
Apr  2 01:09:58 server kernel: [483707.688716] Modules linked in: drbg 
ansi_cprng ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat 
ebtables x_tables 8021q garp mrp bridge stp llc dm_crypt intel_rapl 
x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm ipmi_ssif 
irqbypass ipmi_devintf crct10dif_pclmul crc32_pclmul ghash_clmulni_intel 
aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd sb_edac 
dcdbas edac_core acpi_power_meter shpchp ipmi_si mei_me input_leds lpc_ich 
ipmi_msghandler mei 8250_fintek mac_hid parport_pc ppdev lp parport igb dca ptp 
hid_generic usbhid hid ahci pps_core libahci i2c_algo_bit megaraid_sas wmi fjes
Apr  2 01:09:58 server kernel: [483707.688718] CPU: 7 PID: 83 Comm: kswapd0 
Tainted: G             L  4.4.0-67-generic #88~14.04.1-Ubuntu
Apr  2 01:09:58 server kernel: [483707.688719] Hardware name: Dell Inc. 
PowerEdge T630, BIOS 1.5.4 10/04/2015
Apr  2 01:09:58 server kernel: [483707.688720] task: ffff881034ac6200 ti: 
ffff88102da44000 task.ti: ffff88102da44000
Apr  2 01:09:58 server kernel: [483707.688722] RIP: 0010:[<ffffffff810c671a>]  
[<ffffffff810c671a>] native_queued_spin_lock_slowpath+0x10a/0x170
Apr  2 01:09:58 server kernel: [483707.688723] RSP: 0018:ffff88102da47c58  
EFLAGS: 00000246
Apr  2 01:09:58 server kernel: [483707.688724] RAX: 0000000000000000 RBX: 
000000000000037a RCX: ffff88103d3d7940
Apr  2 01:09:58 server kernel: [483707.688725] RDX: ffff88103d417940 RSI: 
0000000000200000 RDI: ffffffff821dc7e0
Apr  2 01:09:58 server kernel: [483707.688725] RBP: ffff88102da47c58 R08: 
0000000000000101 R09: 28f5c28f5c28f5c3
Apr  2 01:09:58 server kernel: [483707.688726] R10: 0000000000000000 R11: 
ffff88102da47a58 R12: 0000000000000080
Apr  2 01:09:58 server kernel: [483707.688727] R13: 0000000000000000 R14: 
ffffffff81e8ae40 R15: 0000000000007ace
Apr  2 01:09:58 server kernel: [483707.688728] FS:  0000000000000000(0000) 
GS:ffff88103d3c0000(0000) knlGS:0000000000000000
Apr  2 01:09:58 server kernel: [483707.688728] CS:  0010 DS: 0000 ES: 0000 CR0: 
0000000080050033
Apr  2 01:09:58 server kernel: [483707.688729] CR2: 00007ff3c624c0f2 CR3: 
0000000001e0c000 CR4: 00000000001426e0
Apr  2 01:09:58 server kernel: [483707.688730] Stack:
Apr  2 01:09:58 server kernel: [483707.688731]  ffff88102da47c68 
ffffffff81183477 ffff88102da47c78 ffffffff81806af0
Apr  2 01:09:58 server kernel: [483707.688733]  ffff88102da47c88 
ffffffff8125dfd5 ffff88102da47d60 ffffffff8119601a
Apr  2 01:09:58 server kernel: [483707.688734]  0000000000000000 
0000000000000000 ffff880da9fdf340 0000000000e86866
Apr  2 01:09:58 server kernel: [483707.688735] Call Trace:
Apr  2 01:09:58 server kernel: [483707.688737]  [<ffffffff81183477>] 
queued_spin_lock_slowpath+0xb/0xf
Apr  2 01:09:58 server kernel: [483707.688739]  [<ffffffff81806af0>] 
_raw_spin_lock+0x20/0x30
Apr  2 01:09:58 server kernel: [483707.688740]  [<ffffffff8125dfd5>] 
mb_cache_shrink_count+0x15/0xb0
Apr  2 01:09:58 server kernel: [483707.688742]  [<ffffffff8119601a>] 
shrink_slab.part.40+0x10a/0x3f0
Apr  2 01:09:58 server kernel: [483707.688744]  [<ffffffff8119a6f7>] 
shrink_zone+0x2a7/0x2c0
Apr  2 01:09:58 server kernel: [483707.688746]  [<ffffffff8119b6c7>] 
kswapd+0x4c7/0x970
Apr  2 01:09:58 server kernel: [483707.688749]  [<ffffffff8119b200>] ? 
mem_cgroup_shrink_node_zone+0x190/0x190
Apr  2 01:09:58 server kernel: [483707.688750]  [<ffffffff8109cd19>] 
kthread+0xc9/0xe0
Apr  2 01:09:58 server kernel: [483707.688752]  [<ffffffff8109cc50>] ? 
kthread_park+0x60/0x60
Apr  2 01:09:58 server kernel: [483707.688753]  [<ffffffff8180724f>] 
ret_from_fork+0x3f/0x70
Apr  2 01:09:58 server kernel: [483707.688754]  [<ffffffff8109cc50>] ? 
kthread_park+0x60/0x60
Apr  2 01:09:58 server kernel: [483707.688772] Code: c2 c1 e8 12 48 c1 ea 0c 83 
e8 01 83 e2 30 48 98 48 81 c2 40 79 01 00 48 03 14 c5 00 99 f3 81 48 89 0a 8b 
41 08 85 c0 75 0d f3 90 <8b> 41 08 85 c0 74 f7 eb 02 f3 90 8b 17 66 85 d2 75 f7 
39 f2 66
Apr  2 01:09:58 server kernel: [483707.698419] Modules linked in: drbg 
ansi_cprng ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat 
ebtables x_tables 8021q garp mrp bridge stp llc dm_crypt intel_rapl 
x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm ipmi_ssif 
irqbypass ipmi_devintf crct10dif_pclmul crc32_pclmul ghash_clmulni_intel 
aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd sb_edac 
dcdbas edac_core acpi_power_meter shpchp ipmi_si mei_me input_leds lpc_ich 
ipmi_msghandler mei 8250_fintek mac_hid parport_pc ppdev lp parport igb dca ptp 
hid_generic usbhid hid ahci pps_core libahci i2c_algo_bit megaraid_sas wmi fjes
Apr  2 01:09:58 server kernel: [483707.698441] CPU: 3 PID: 3119 Comm: freshclam 
Tainted: G             L  4.4.0-67-generic #88~14.04.1-Ubuntu
Apr  2 01:09:58 server kernel: [483707.698441] Hardware name: Dell Inc. 
PowerEdge T630, BIOS 1.5.4 10/04/2015
Apr  2 01:09:58 server kernel: [483707.698443] task: ffff88102b9b3800 ti: 
ffff88102ef28000 task.ti: ffff88102ef28000
Apr  2 01:09:58 server kernel: [483707.698444] RIP: 0010:[<ffffffff810c671d>]  
[<ffffffff810c671d>] native_queued_spin_lock_slowpath+0x10d/0x170
Apr  2 01:09:58 server kernel: [483707.698447] RSP: 0018:ffff88102ef2b7c0  
EFLAGS: 00000246
Apr  2 01:09:58 server kernel: [483707.698448] RAX: 0000000000000000 RBX: 
000000000000037a RCX: ffff88103d2d7940
Apr  2 01:09:58 server kernel: [483707.698448] RDX: ffff88103d3d7940 RSI: 
0000000000100000 RDI: ffffffff821dc7e0
Apr  2 01:09:58 server kernel: [483707.698449] RBP: ffff88102ef2b7c0 R08: 
0000000000000101 R09: 28f5c28f5c28f5c3
Apr  2 01:09:58 server kernel: [483707.698450] R10: 0000000000000000 R11: 
ffff88102ef2b5c8 R12: 0000000000000080
Apr  2 01:09:58 server kernel: [483707.698451] R13: 0000000000000000 R14: 
ffffffff81e8ae40 R15: 0000000000007ace
Apr  2 01:09:58 server kernel: [483707.698452] FS:  00007fe59bc02780(0000) 
GS:ffff88103d2c0000(0000) knlGS:0000000000000000
Apr  2 01:09:58 server kernel: [483707.698453] CS:  0010 DS: 0000 ES: 0000 CR0: 
0000000080050033
Apr  2 01:09:58 server kernel: [483707.698454] CR2: 00007fe59bc13000 CR3: 
000000102c83f000 CR4: 00000000001426e0
Apr  2 01:09:58 server kernel: [483707.698455] Stack:
Apr  2 01:09:58 server kernel: [483707.698456]  ffff88102ef2b7d0 
ffffffff81183477 ffff88102ef2b7e0 ffffffff81806af0
Apr  2 01:09:58 server kernel: [483707.698457]  ffff88102ef2b7f0 
ffffffff8125dfd5 ffff88102ef2b8c8 ffffffff8119601a
Apr  2 01:09:58 server kernel: [483707.698459]  0000000000000003 
0000000000000001 0000000000000000 0000000000e876d8
Apr  2 01:09:58 server kernel: [483707.698461] Call Trace:
Apr  2 01:09:58 server kernel: [483707.698463]  [<ffffffff81183477>] 
queued_spin_lock_slowpath+0xb/0xf
Apr  2 01:09:58 server kernel: [483707.698465]  [<ffffffff81806af0>] 
_raw_spin_lock+0x20/0x30
Apr  2 01:09:58 server kernel: [483707.698467]  [<ffffffff8125dfd5>] 
mb_cache_shrink_count+0x15/0xb0
Apr  2 01:09:58 server kernel: [483707.698469]  [<ffffffff8119601a>] 
shrink_slab.part.40+0x10a/0x3f0
Apr  2 01:09:58 server kernel: [483707.698471]  [<ffffffff8119a6f7>] 
shrink_zone+0x2a7/0x2c0
Apr  2 01:09:58 server kernel: [483707.698473]  [<ffffffff8119aa86>] 
do_try_to_free_pages+0x166/0x3d0
Apr  2 01:09:58 server kernel: [483707.698475]  [<ffffffff81197dfd>] ? 
throttle_direct_reclaim+0x8d/0x230
Apr  2 01:09:58 server kernel: [483707.698477]  [<ffffffff8119ada5>] 
try_to_free_pages+0xb5/0x170
Apr  2 01:09:58 server kernel: [483707.698479]  [<ffffffff811fbb6e>] 
__alloc_pages_slowpath.constprop.87+0x323/0x78c
Apr  2 01:09:58 server kernel: [483707.698482]  [<ffffffff8118e3c7>] 
__alloc_pages_nodemask+0x237/0x240
Apr  2 01:09:58 server kernel: [483707.698483]  [<ffffffff811d4298>] 
alloc_pages_current+0x88/0x120
Apr  2 01:09:58 server kernel: [483707.698485]  [<ffffffff8118562e>] 
__page_cache_alloc+0xae/0xc0
Apr  2 01:09:58 server kernel: [483707.698487]  [<ffffffff81186029>] 
pagecache_get_page+0x59/0x1c0
Apr  2 01:09:58 server kernel: [483707.698488]  [<ffffffff811861b6>] 
grab_cache_page_write_begin+0x26/0x40
Apr  2 01:09:58 server kernel: [483707.698490]  [<ffffffff8128e6d1>] 
ext4_da_write_begin+0xa1/0x330
Apr  2 01:09:58 server kernel: [483707.698492]  [<ffffffff811851f0>] 
generic_perform_write+0xc0/0x1a0
Apr  2 01:09:58 server kernel: [483707.698494]  [<ffffffff8121a89b>] ? 
file_update_time+0x3b/0xf0
Apr  2 01:09:58 server kernel: [483707.698496]  [<ffffffff811873a7>] 
__generic_file_write_iter+0x197/0x1e0
Apr  2 01:09:58 server kernel: [483707.698498]  [<ffffffff812832e6>] 
ext4_file_write_iter+0xf6/0x360
Apr  2 01:09:58 server kernel: [483707.698500]  [<ffffffff812008f8>] 
new_sync_write+0x88/0xb0
Apr  2 01:09:58 server kernel: [483707.698501]  [<ffffffff81200947>] 
__vfs_write+0x27/0x40
Apr  2 01:09:58 server kernel: [483707.698503]  [<ffffffff81200f52>] 
vfs_write+0xa2/0x1a0
Apr  2 01:09:58 server kernel: [483707.698504]  [<ffffffff81201c76>] 
SyS_write+0x46/0xa0
Apr  2 01:09:58 server kernel: [483707.698506]  [<ffffffff81806eb6>] 
entry_SYSCALL_64_fastpath+0x16/0x75
Apr  2 01:09:58 server kernel: [483707.698507] Code: 12 48 c1 ea 0c 83 e8 01 83 
e2 30 48 98 48 81 c2 40 79 01 00 48 03 14 c5 00 99 f3 81 48 89 0a 8b 41 08 85 
c0 75 0d f3 90 8b 41 08 <85> c0 74 f7 eb 02 f3 90 8b 17 66 85 d2 75 f7 39 f2 66 
90 75 0f
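
To summarise logs like the ones above, a minimal sketch such as the
following can pull the CPU number, stuck time, and task out of each
soft-lockup line. It assumes the unwrapped one-line form found in the
original syslog file (the lines above are wrapped by the mail client),
and the names SOFT_LOCKUP and find_soft_lockups are made up for this
illustration, not part of any existing tool.

```python
import re

# Pattern for the kernel's soft-lockup message as it appears on one line, e.g.
# "NMI watchdog: BUG: soft lockup - CPU#7 stuck for 22s! [kswapd0:83]"
# (hypothetical helper; the regex is an assumption based on the log above)
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^\]]+):(?P<pid>\d+)\]"
)


def find_soft_lockups(lines):
    """Return one dict per soft-lockup message found in the given lines."""
    hits = []
    for line in lines:
        match = SOFT_LOCKUP.search(line)
        if match:
            hits.append({
                "cpu": int(match.group("cpu")),
                "secs": int(match.group("secs")),
                "comm": match.group("comm"),   # task name, e.g. kswapd0
                "pid": int(match.group("pid")),
            })
    return hits
```

Passing an open log file to find_soft_lockups() (the path depends on
the syslog setup) gives one entry per lockup report, which makes it
easier to see which tasks and CPUs are affected over time.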

The problem has existed for a while now. None of the latest kernel
updates helped. Can you please advise me what to do? Thank you!

ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: linux-image-4.4.0-67-generic 4.4.0-67.88~14.04.1
ProcVersionSignature: Ubuntu 4.4.0-67.88~14.04.1-generic 4.4.49
Uname: Linux 4.4.0-67-generic x86_64
ApportVersion: 2.14.1-0ubuntu3.23
Architecture: amd64
Date: Tue Apr  4 12:38:13 2017
InstallationDate: Installed on 2016-02-22 (406 days ago)
InstallationMedia: Ubuntu-Server 14.04 LTS "Trusty Tahr" - Release amd64 
(20140416.2)
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: linux-lts-xenial
UpgradeStatus: No upgrade log present (probably fresh install)

** Affects: linux-lts-xenial (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: amd64 apport-bug trusty

https://bugs.launchpad.net/bugs/1679625

Title:
  Server crashes on soft lockup
