Public bug reported:
I noticed the following kernel error shortly before an exponential increase
in server load. ps listings stopped returning and got stuck on processes
that were only in an "S" state and appeared unresponsive to signals. (The
process was a ceph-osd, if it matters.)
Jul 25 00:32:58 SERVER kernel: [1529921.423169] divide error: 0000 [#1] SMP
Jul 25 00:32:58 SERVER kernel: [1529921.423196] Modules linked in: ip6table_raw
ip6table_mangle nf_conntrack_ipv6 xt_CT xt_connmark xt_mac xt_comment
xt_physdev br_netfilter xt_multiport xt_set ip_set_hash_net ip_set nfnetlink
veth xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4
iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 iptable_raw nf_defrag_ipv4
xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter
ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables nbd
ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp vport_gre
ip_gre libiscsi_tcp ip_tunnel libiscsi gre scsi_transport_iscsi openvswitch
nf_defrag_ipv6 nf_conntrack dm_crypt bonding ipmi_ssif ipmi_devintf dcdbas
intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm
dm_multipath irqbypass sb_edac mei_me mei edac_core ipmi_si lpc_ich
ipmi_msghandler 8250_fintek acpi_power_meter shpchp mac_hid xfs libcrc32c
btrfs xor raid6_pq bcache crct10dif_pclmul crc32_pclmul aesni_intel aes_x86_64
lrw gf128mul glue_helper ablk_helper cryptd ixgbe igb vxlan ip6_udp_tunnel
dca udp_tunnel ptp pps_core megaraid_sas i2c_algo_bit mdio wmi fjes
Jul 25 00:32:58 SERVER kernel: [1529921.423919] CPU: 12 PID: 2300042 Comm:
ms_pipe_read Not tainted 4.4.0-28-generic #47~14.04.1-Ubuntu
Jul 25 00:32:58 SERVER kernel: [1529921.423942] Hardware name: Dell Inc.
PowerEdge R730xd/0H21J3, BIOS 1.0.4 08/28/2014
Jul 25 00:32:58 SERVER kernel: [1529921.423965] task: ffff881e7baba940 ti:
ffff880103fcc000 task.ti: ffff880103fcc000
Jul 25 00:32:58 SERVER kernel: [1529921.424013] RIP: 0010:[<ffffffff810aff78>]
[<ffffffff810aff78>] task_numa_find_cpu+0x238/0x700
Jul 25 00:32:58 SERVER kernel: [1529921.424087] RSP: 0000:ffff880103fcfbb0
EFLAGS: 00010257
Jul 25 00:32:58 SERVER kernel: [1529921.424126] RAX: 0000000000000000 RBX:
ffff880103fcfc50 RCX: 0000000000000000
Jul 25 00:32:58 SERVER kernel: [1529921.424191] RDX: 0000000000000000 RSI:
0000000000000001 RDI: ffff881ffed96d70
Jul 25 00:32:58 SERVER kernel: [1529921.424256] RBP: ffff880103fcfc18 R08:
0000000116cb13e1 R09: 0000000000000375
Jul 25 00:32:58 SERVER kernel: [1529921.424321] R10: 000000000001e8f9 R11:
0000000000000072 R12: ffff881e11913700
Jul 25 00:32:58 SERVER kernel: [1529921.424386] R13: 0000000000000001 R14:
0000000000000000 R15: fffffffffffffd68
Jul 25 00:32:58 SERVER kernel: [1529921.424451] FS: 00007fec1582c700(0000)
GS:ffff881ffed80000(0000) knlGS:0000000000000000
Jul 25 00:32:58 SERVER kernel: [1529921.424519] CS: 0010 DS: 0000 ES: 0000
CR0: 0000000080050033
Jul 25 00:32:58 SERVER kernel: [1529921.424559] CR2: 0000558c2fe89ff0 CR3:
0000003a7798b000 CR4: 00000000001406e0
Jul 25 00:32:58 SERVER kernel: [1529921.424624] Stack:
Jul 25 00:32:58 SERVER kernel: [1529921.424655] ffff880103fcfbb0
ffff880103fcfbb0 ffff881ffedd6d70 ffff881e7baba940
Jul 25 00:32:58 SERVER kernel: [1529921.424737] 000000000000006b
00000000000000c3 0000000000016d00 000000000000006b
Jul 25 00:32:58 SERVER kernel: [1529921.424818] ffff881e7baba940
00000000000001be ffff880103fcfc50 0000000000000192
Jul 25 00:32:58 SERVER kernel: [1529921.424899] Call Trace:
Jul 25 00:32:58 SERVER kernel: [1529921.424934] [<ffffffff810b08e0>]
task_numa_migrate+0x4a0/0x930
Jul 25 00:32:58 SERVER kernel: [1529921.424976] [<ffffffff810b0de9>]
numa_migrate_preferred+0x79/0x80
Jul 25 00:32:58 SERVER kernel: [1529921.425018] [<ffffffff810b563d>]
task_numa_fault+0x91d/0xcc0
Jul 25 00:32:58 SERVER kernel: [1529921.425062] [<ffffffff811d406e>] ?
mpol_misplaced+0x14e/0x190
Jul 25 00:32:58 SERVER kernel: [1529921.425104] [<ffffffff811b12c6>]
handle_pte_fault+0x5a6/0x1470
Jul 25 00:32:58 SERVER kernel: [1529921.425150] [<ffffffff810fb2b2>] ?
do_futex+0xa2/0x520
Jul 25 00:32:58 SERVER kernel: [1529921.425192] [<ffffffff811b3000>]
handle_mm_fault+0x250/0x540
Jul 25 00:32:58 SERVER kernel: [1529921.425236] [<ffffffff81067c0a>]
__do_page_fault+0x19a/0x430
Jul 25 00:32:58 SERVER kernel: [1529921.425279] [<ffffffff810fb7a1>] ?
SyS_futex+0x71/0x150
Jul 25 00:32:58 SERVER kernel: [1529921.425320] [<ffffffff81067ec2>]
do_page_fault+0x22/0x30
Jul 25 00:32:58 SERVER kernel: [1529921.425362] [<ffffffff817f2fb8>]
page_fault+0x28/0x30
Jul 25 00:32:58 SERVER kernel: [1529921.425402] Code: 4d b0 4c 89 f7 e8 29 d5
ff ff 48 8b 4d b0 49 8b 86 b0 00 00 00 31 d2 48 0f af 81 d8 01 00 00 49 8b 4e
78 4c 8b
73 78 48 83 c1 01 <48> f7 f1 48 8b 4b 20 49 89 c1 48 29 c1 4c 03 4b 48 4c 39 7d
d0
Jul 25 00:32:58 SERVER kernel: [1529921.425790] RIP [<ffffffff810aff78>]
task_numa_find_cpu+0x238/0x700
Jul 25 00:32:58 SERVER kernel: [1529921.425836] RSP <ffff880103fcfbb0>
Jul 25 00:32:58 SERVER kernel: [1529921.426417] ---[ end trace 6e3f67e365a57c9f
]---
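
For anyone triaging this, here is a minimal Python sketch that pulls the
faulting symbol, the register values, and the marked instruction bytes out of
the trace above. The helper names (parse_rip, parse_registers, faulting_bytes)
are hypothetical and written only for this illustration; the sample lines are
rejoined from the wrapped log with timestamps stripped. The marked bytes
48 f7 f1 encode the x86-64 instruction "div %rcx", preceded by 48 83 c1 01
("add $0x1,%rcx"), and RCX reads as zero in the register dump, which appears
consistent with the reported divide error. This is not a root-cause analysis.

```python
import re

# Lines rejoined from the wrapped oops above (timestamps stripped; the Code
# line is trimmed to the bytes around the marked instruction).
RIP_LINE = ("RIP: 0010:[<ffffffff810aff78>] [<ffffffff810aff78>] "
            "task_numa_find_cpu+0x238/0x700")
REG_LINE = "RAX: 0000000000000000 RBX: ffff880103fcfc50 RCX: 0000000000000000"
CODE_LINE = "Code: 73 78 48 83 c1 01 <48> f7 f1 48 8b 4b 20 49 89 c1"


def parse_rip(line):
    """Return (address, symbol, offset, size) from an oops RIP line."""
    m = re.search(
        r"\[<([0-9a-f]+)>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)", line)
    if m is None:
        raise ValueError("no symbol+offset found in: " + line)
    addr, symbol, offset, size = m.groups()
    return int(addr, 16), symbol, int(offset, 16), int(size, 16)


def parse_registers(line):
    """Return {register name: integer value} for 64-bit registers on a line."""
    pairs = re.findall(r"\b(R[A-Z0-9]{2}): ([0-9a-f]{16})\b", line)
    return {name: int(value, 16) for name, value in pairs}


def faulting_bytes(line, count=3):
    """Return `count` hex bytes starting at the <xx> marker on a Code line."""
    tokens = line.split("Code:", 1)[1].split()
    start = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    return [t.strip("<>") for t in tokens[start:start + count]]


if __name__ == "__main__":
    _, symbol, offset, _ = parse_rip(RIP_LINE)
    print(symbol, hex(offset))             # task_numa_find_cpu 0x238
    print(parse_registers(REG_LINE)["RCX"])  # 0
    print(faulting_bytes(CODE_LINE))       # ['48', 'f7', 'f1']
```

The same helpers should work on the other register lines of the trace once
each wrapped pair of lines has been joined back into one.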
Linux SERVER 4.4.0-31-generic #50~14.04.1-Ubuntu SMP Wed Jul 13 01:07:32
UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
** Affects: linux-lts-xenial (Ubuntu)
Importance: Undecided
Status: New
** Tags: canonical-bootstack
https://bugs.launchpad.net/bugs/1606098
Title:
CPU lockups divide error: 0000 [#1] SMP