Public bug reported:

I noticed the following kernel error prior to an exponential increase in server load, with ps listings not returning and getting stuck only on processes in an "S" state that seemed unresponsive to signals. (The process was a ceph-osd, if it matters.)

Jul 25 00:32:58 SERVER kernel: [1529921.423169] divide error: 0000 [#1] SMP
Jul 25 00:32:58 SERVER kernel: [1529921.423196] Modules linked in: ip6table_raw ip6table_mangle nf_conntrack_ipv6 xt_CT xt_connmark xt_mac xt_comment xt_physdev br_netfilter xt_multiport xt_set ip_set_hash_net ip_set nfnetlink veth xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 iptable_raw nf_defrag_ipv4 xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables nbd ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp vport_gre ip_gre libiscsi_tcp ip_tunnel libiscsi gre scsi_transport_iscsi openvswitch nf_defrag_ipv6 nf_conntrack dm_crypt bonding ipmi_ssif ipmi_devintf dcdbas intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm dm_multipath irqbypass sb_edac mei_me mei edac_core ipmi_si lpc_ich ipmi_msghandler 8250_fintek acpi_power_meter shpchp mac_hid xfs libcrc32c btrfs xor raid6_pq bcache crct10dif_pclmul crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd ixgbe igb vxlan ip6_udp_tunnel dca udp_tunnel ptp pps_core megaraid_sas i2c_algo_bit mdio wmi fjes
Jul 25 00:32:58 SERVER kernel: [1529921.423919] CPU: 12 PID: 2300042 Comm: ms_pipe_read Not tainted 4.4.0-28-generic #47~14.04.1-Ubuntu
Jul 25 00:32:58 SERVER kernel: [1529921.423942] Hardware name: Dell Inc. PowerEdge R730xd/0H21J3, BIOS 1.0.4 08/28/2014
Jul 25 00:32:58 SERVER kernel: [1529921.423965] task: ffff881e7baba940 ti: ffff880103fcc000 task.ti: ffff880103fcc000
Jul 25 00:32:58 SERVER kernel: [1529921.424013] RIP: 0010:[<ffffffff810aff78>] [<ffffffff810aff78>] task_numa_find_cpu+0x238/0x700
Jul 25 00:32:58 SERVER kernel: [1529921.424087] RSP: 0000:ffff880103fcfbb0 EFLAGS: 00010257
Jul 25 00:32:58 SERVER kernel: [1529921.424126] RAX: 0000000000000000 RBX: ffff880103fcfc50 RCX: 0000000000000000
Jul 25 00:32:58 SERVER kernel: [1529921.424191] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff881ffed96d70
Jul 25 00:32:58 SERVER kernel: [1529921.424256] RBP: ffff880103fcfc18 R08: 0000000116cb13e1 R09: 0000000000000375
Jul 25 00:32:58 SERVER kernel: [1529921.424321] R10: 000000000001e8f9 R11: 0000000000000072 R12: ffff881e11913700
Jul 25 00:32:58 SERVER kernel: [1529921.424386] R13: 0000000000000001 R14: 0000000000000000 R15: fffffffffffffd68
Jul 25 00:32:58 SERVER kernel: [1529921.424451] FS: 00007fec1582c700(0000) GS:ffff881ffed80000(0000) knlGS:0000000000000000
Jul 25 00:32:58 SERVER kernel: [1529921.424519] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 25 00:32:58 SERVER kernel: [1529921.424559] CR2: 0000558c2fe89ff0 CR3: 0000003a7798b000 CR4: 00000000001406e0
Jul 25 00:32:58 SERVER kernel: [1529921.424624] Stack:
Jul 25 00:32:58 SERVER kernel: [1529921.424655] ffff880103fcfbb0 ffff880103fcfbb0 ffff881ffedd6d70 ffff881e7baba940
Jul 25 00:32:58 SERVER kernel: [1529921.424737] 000000000000006b 00000000000000c3 0000000000016d00 000000000000006b
Jul 25 00:32:58 SERVER kernel: [1529921.424818] ffff881e7baba940 00000000000001be ffff880103fcfc50 0000000000000192
Jul 25 00:32:58 SERVER kernel: [1529921.424899] Call Trace:
Jul 25 00:32:58 SERVER kernel: [1529921.424934] [<ffffffff810b08e0>] task_numa_migrate+0x4a0/0x930
Jul 25 00:32:58 SERVER kernel: [1529921.424976] [<ffffffff810b0de9>] numa_migrate_preferred+0x79/0x80
Jul 25 00:32:58 SERVER kernel: [1529921.425018] [<ffffffff810b563d>] task_numa_fault+0x91d/0xcc0
Jul 25 00:32:58 SERVER kernel: [1529921.425062] [<ffffffff811d406e>] ? mpol_misplaced+0x14e/0x190
Jul 25 00:32:58 SERVER kernel: [1529921.425104] [<ffffffff811b12c6>] handle_pte_fault+0x5a6/0x1470
Jul 25 00:32:58 SERVER kernel: [1529921.425150] [<ffffffff810fb2b2>] ? do_futex+0xa2/0x520
Jul 25 00:32:58 SERVER kernel: [1529921.425192] [<ffffffff811b3000>] handle_mm_fault+0x250/0x540
Jul 25 00:32:58 SERVER kernel: [1529921.425236] [<ffffffff81067c0a>] __do_page_fault+0x19a/0x430
Jul 25 00:32:58 SERVER kernel: [1529921.425279] [<ffffffff810fb7a1>] ? SyS_futex+0x71/0x150
Jul 25 00:32:58 SERVER kernel: [1529921.425320] [<ffffffff81067ec2>] do_page_fault+0x22/0x30
Jul 25 00:32:58 SERVER kernel: [1529921.425362] [<ffffffff817f2fb8>] page_fault+0x28/0x30
Jul 25 00:32:58 SERVER kernel: [1529921.425402] Code: 4d b0 4c 89 f7 e8 29 d5 ff ff 48 8b 4d b0 49 8b 86 b0 00 00 00 31 d2 48 0f af 81 d8 01 00 00 49 8b 4e 78 4c 8b 73 78 48 83 c1 01 <48> f7 f1 48 8b 4b 20 49 89 c1 48 29 c1 4c 03 4b 48 4c 39 7d d0
Jul 25 00:32:58 SERVER kernel: [1529921.425790] RIP [<ffffffff810aff78>] task_numa_find_cpu+0x238/0x700
Jul 25 00:32:58 SERVER kernel: [1529921.425836] RSP <ffff880103fcfbb0>
Jul 25 00:32:58 SERVER kernel: [1529921.426417] ---[ end trace 6e3f67e365a57c9f ]---

Linux SERVER 4.4.0-31-generic #50~14.04.1-Ubuntu SMP Wed Jul 13 01:07:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

** Affects: linux-lts-xenial (Ubuntu)
     Importance: Undecided
         Status: New

** Tags: canonical-bootstack

--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1606098

Title:
  CPU lockups divide error: 0000 [#1] SMP

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-lts-xenial/+bug/1606098/+subscriptions

--
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs