On a system running 4.4.0-28-generic I get something similar-looking:

foonode kernel: [595908.569972] divide error: 0000 [#1] SMP
foonode kernel: [595908.571257] Modules linked in: ip6table_raw ip6table_mangle nf_conntrack_ipv6 xt_CT xt_connmark xt_mac xt_comment xt_physdev br_netfilter xt_set xt_multiport ip_set_hash_net ip_set nfnetlink veth iptable_raw xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables nbd ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi vport_gre ip_gre ip_tunnel gre openvswitch nf_defrag_ipv6 nf_conntrack dm_crypt bonding ipmi_devintf intel_rapl x86_pkg_temp_thermal intel_powerclamp ipmi_ssif coretemp kvm_intel dcdbas kvm irqbypass dm_multipath joydev input_leds sb_edac mei_me edac_core shpchp mei lpc_ich ipmi_si 8250_fintek ipmi_msghandler mac_hid acpi_power_meter xfs libcrc32c btrfs xor raid6_pq bcache crct10dif_pclmul crc32_pclmul ixgbe hid_generic igb vxlan ip6_udp_tunnel usbhid aesni_intel udp_tunnel dca aes_x86_64 ptp lrw gf128mul glue_helper ablk_helper hid pps_core cryptd i2c_algo_bit mdio megaraid_sas wmi fjes
foonode kernel: [595908.607022] CPU: 30 PID: 3173122 Comm: ms_pipe_write Not tainted 4.4.0-28-generic #47~14.04.1-Ubuntu
foonode kernel: [595908.609874] Hardware name: Dell Inc. PowerEdge R730xd/0H21J3, BIOS 1.0.4 08/28/2014
foonode kernel: [595908.612265] task: ffff881efbe78000 ti: ffff8816f3410000 task.ti: ffff8816f3410000
foonode kernel: [595908.614601] RIP: 0010:[<ffffffff810aff78>] [<ffffffff810aff78>] task_numa_find_cpu+0x238/0x700
foonode kernel: [595908.617346] RSP: 0000:ffff8816f3413bb0 EFLAGS: 00010257
foonode kernel: [595908.619007] RAX: 0000000000000000 RBX: ffff8816f3413c50 RCX: 0000000000000000
foonode kernel: [595908.733387] RDX: 0000000000000000 RSI: ffff881ffefc0000 RDI: ffff881ffefd6d70
foonode kernel: [595908.852154] RBP: ffff8816f3413c18 R08: 0000000108dfe960 R09: 0000000000000042
foonode kernel: [595908.973271] R10: 000000000000001e R11: 00000000000001dc R12: ffff883e649ea940
foonode kernel: [595909.093668] R13: 0000000000000019 R14: 000000000000011f R15: 00000000000001bf
foonode kernel: [595909.213849] FS: 00007fd7eb7f6700(0000) GS:ffff881ffefc0000(0000) knlGS:0000000000000000
foonode kernel: [595909.334056] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
foonode kernel: [595909.394132] CR2: 0000562170fdde00 CR3: 00000025c0ba8000 CR4: 00000000001406e0
foonode kernel: [595909.512605] Stack:
foonode kernel: [595909.569642] 000000304e62e962 0000000000000054 00000000ffffffff ffff881efbe78000
foonode kernel: [595909.684440] 0000000000000039 0000000000000181 0000000000016d00 0000000000000039
foonode kernel: [595909.798807] ffff881efbe78000 00000000000001d7 ffff8816f3413c50 0000000000000133
foonode kernel: [595909.913067] Call Trace:
foonode kernel: [595909.968501] [<ffffffff810b08e0>] task_numa_migrate+0x4a0/0x930
foonode kernel: [595910.024588] [<ffffffff816d5217>] ? release_sock+0x117/0x160
foonode kernel: [595910.079692] [<ffffffff810b0de9>] numa_migrate_preferred+0x79/0x80
foonode kernel: [595910.134123] [<ffffffff810b563d>] task_numa_fault+0x91d/0xcc0
foonode kernel: [595910.187489] [<ffffffff811d406e>] ? mpol_misplaced+0x14e/0x190
foonode kernel: [595910.239834] [<ffffffff811b12c6>] handle_pte_fault+0x5a6/0x1470
foonode kernel: [595910.291547] [<ffffffff810f8d01>] ? futex_wake+0x81/0x150
foonode kernel: [595910.342550] [<ffffffff810fb304>] ? do_futex+0xf4/0x520
foonode kernel: [595910.392281] [<ffffffff811b3000>] handle_mm_fault+0x250/0x540
foonode kernel: [595910.441068] [<ffffffff81067c0a>] __do_page_fault+0x19a/0x430
foonode kernel: [595910.488511] [<ffffffff81067ec2>] do_page_fault+0x22/0x30
foonode kernel: [595910.534771] [<ffffffff817f2fb8>] page_fault+0x28/0x30
foonode kernel: [595910.579823] Code: 4d b0 4c 89 f7 e8 29 d5 ff ff 48 8b 4d b0 49 8b 86 b0 00 00 00 31 d2 48 0f af 81 d8 01 00 00 49 8b 4e 78 4c 8b 73 78 48 83 c1 01 <48> f7 f1 48 8b 4b 20 49 89 c1 48 29 c1 4c 03 4b 48 4c 39 7d d0
foonode kernel: [595910.716552] RIP [<ffffffff810aff78>] task_numa_find_cpu+0x238/0x700
foonode kernel: [595910.760983] RSP <ffff8816f3413bb0>
foonode kernel: [595910.869336] ---[ end trace a318cca29e8da7ca ]---

** Tags added: canonical-bootstack

https://bugs.launchpad.net/bugs/1568729

Title:
  divide error: 0000 [#1] SMP in task_numa_migrate - handle_mm_fault

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Xenial:
  In Progress

Bug description:
  While running qemu 2.5 on a trusty host running 4.4.0-15.31~14.04.1
  the host system has crashed (load > 200) 3 times in the last 3 days.

  Always with this stack trace:

  Apr 9 19:01:09 cnode9.0 kernel: [197071.195577] divide error: 0000 [#1] SMP
  Apr 9 19:01:09 cnode9.0 kernel: [197071.195633] Modules linked in: vhost_net vhost macvtap macvlan arc4 md4 nls_utf8 cifs nfnetlink_queue nfnetlink xt_CHECKSUM xt_nat iptable_nat nf_nat_ipv4 xt_NFQUEUE xt_CLASSIFY ip6table_mangle sch_sfq sch_htb veth dccp_diag dccp tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag ebtable_filter ebtables nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables iptable_mangle xt_CT iptable_raw xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack iptable_filter ip_tables x_tables dummy bridge stp llc ipmi_ssif ipmi_devintf intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm dcdbas irqbypass crct10dif_pclmul crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd joydev input_leds nf_nat_ftp sb_edac nf_conntrack_ftp edac_core cdc_ether nf_nat_pptp usbnet nf_conntrack_pptp mii nf_nat_proto_gre lpc_ich nf_nat_sip ioatdma nf_nat nf_conntrack_sip nfsd ipmi_si 8250_fintek nf_conntrack_proto_gre ipmi_msghandler acpi_pad wmi shpchp nf_conntrack acpi_power_meter mac_hid auth_rpcgss nfs_acl bonding nfs lp lockd parport grace sunrpc fscache tcp_htcp xfs btrfs hid_generic usbhid hid raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor ixgbe raid6_pq libcrc32c igb vxlan raid1 i2c_algo_bit ip6_udp_tunnel dca udp_tunnel ahci raid0 ptp libahci megaraid_sas multipath pps_core mdio linear fjes
  Apr 9 19:01:09 cnode9.0 kernel: [197071.197014] CPU: 13 PID: 3147726 Comm: ceph-osd Not tainted 4.4.0-15-generic #31~14.04.1-Ubuntu
  Apr 9 19:01:09 cnode9.0 kernel: [197071.197085] Hardware name: Dell Inc. PowerEdge R720/0XH7F2, BIOS 2.5.2 01/28/2015
  Apr 9 19:01:09 cnode9.0 kernel: [197071.197154] task: ffff88252be1ee00 ti: ffff8824fc0d4000 task.ti: ffff8824fc0d4000
  Apr 9 19:01:09 cnode9.0 kernel: [197071.197221] RIP: 0010:[<ffffffff810afec8>] [<ffffffff810afec8>] task_numa_find_cpu+0x238/0x700
  Apr 9 19:01:09 cnode9.0 kernel: [197071.197300] RSP: 0000:ffff8824fc0d7ba8 EFLAGS: 00010257
  Apr 9 19:01:09 cnode9.0 kernel: [197071.197340] RAX: 0000000000000000 RBX: ffff8824fc0d7c48 RCX: 0000000000000000
  Apr 9 19:01:09 cnode9.0 kernel: [197071.197406] RDX: 0000000000000000 RSI: ffff88479f180000 RDI: ffff884782a47600
  Apr 9 19:01:09 cnode9.0 kernel: [197071.197473] RBP: ffff8824fc0d7c10 R08: 0000000102eea157 R09: 00000000000001a8
  Apr 9 19:01:09 cnode9.0 kernel: [197071.197540] R10: 000000000002404b R11: 000000000000023f R12: ffff882380930000
  Apr 9 19:01:09 cnode9.0 kernel: [197071.197606] R13: 0000000000000008 R14: 000000000000008c R15: 0000000000000124
  Apr 9 19:01:09 cnode9.0 kernel: [197071.197673] FS: 00007f19aab5b700(0000) GS:ffff88479f180000(0000) knlGS:0000000000000000
  Apr 9 19:01:09 cnode9.0 kernel: [197071.197741] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  Apr 9 19:01:09 cnode9.0 kernel: [197071.197782] CR2: 0000000025469600 CR3: 00000023846bc000 CR4: 00000000000426e0
  Apr 9 19:01:09 cnode9.0 kernel: [197071.197848] Stack:
  Apr 9 19:01:09 cnode9.0 kernel: [197071.197880] ffffffff817425fb ffff8829af3e9e00 00000000000000f6 ffff88252be1ee00
  Apr 9 19:01:09 cnode9.0 kernel: [197071.197965] 000000000000008d 0000000000000225 0000000000016d40 000000000000008d
  Apr 9 19:01:09 cnode9.0 kernel: [197071.198047] ffff88252be1ee00 00000000000001ad ffff8824fc0d7c48 00000000000000e1
  Apr 9 19:01:09 cnode9.0 kernel: [197071.198132] Call Trace:
  Apr 9 19:01:09 cnode9.0 kernel: [197071.198172] [<ffffffff817425fb>] ? tcp_schedule_loss_probe+0x12b/0x1b0
  Apr 9 19:01:09 cnode9.0 kernel: [197071.198219] [<ffffffff810b0830>] task_numa_migrate+0x4a0/0x930
  Apr 9 19:01:09 cnode9.0 kernel: [197071.198264] [<ffffffff816d2957>] ? release_sock+0x117/0x160
  Apr 9 19:01:09 cnode9.0 kernel: [197071.198306] [<ffffffff810b0d39>] numa_migrate_preferred+0x79/0x80
  Apr 9 19:01:09 cnode9.0 kernel: [197071.198350] [<ffffffff810b557d>] task_numa_fault+0x91d/0xcc0
  Apr 9 19:01:09 cnode9.0 kernel: [197071.198395] [<ffffffff811d35ae>] ? mpol_misplaced+0x14e/0x190
  Apr 9 19:01:09 cnode9.0 kernel: [197071.198439] [<ffffffff811b06b8>] handle_pte_fault+0x5a8/0x14c0
  Apr 9 19:01:09 cnode9.0 kernel: [197071.198485] [<ffffffff810f8531>] ? futex_wake+0x81/0x150
  Apr 9 19:01:09 cnode9.0 kernel: [197071.198526] [<ffffffff810b0de4>] ? set_next_entity+0xa4/0x700
  Apr 9 19:01:09 cnode9.0 kernel: [197071.198569] [<ffffffff810fab44>] ? do_futex+0xf4/0x4d0
  Apr 9 19:01:09 cnode9.0 kernel: [197071.198610] [<ffffffff811b2440>] handle_mm_fault+0x250/0x540
  Apr 9 19:01:09 cnode9.0 kernel: [197071.198654] [<ffffffff81067d19>] __do_page_fault+0x199/0x430
  Apr 9 19:01:09 cnode9.0 kernel: [197071.198696] [<ffffffff81067fd2>] do_page_fault+0x22/0x30
  Apr 9 19:01:09 cnode9.0 kernel: [197071.198740] [<ffffffff817ef878>] page_fault+0x28/0x30
  Apr 9 19:01:09 cnode9.0 kernel: [197071.198775] Code: 4d b0 4c 89 f7 e8 29 d5 ff ff 48 8b 4d b0 49 8b 86 b0 00 00 00 31 d2 48 0f af 81 d8 01 00 00 49 8b 4e 78 4c 8b 73 78 48 83 c1 01 <48> f7 f1 48 8b 4b 20 49 89 c1 48 29 c1 4c 03 4b 48 4c 39 7d d0
  Apr 9 19:01:09 cnode9.0 kernel: [197071.199217] RIP [<ffffffff810afec8>] task_numa_find_cpu+0x238/0x700
  Apr 9 19:01:09 cnode9.0 kernel: [197071.199264] RSP <ffff8824fc0d7ba8>
  Apr 9 19:01:09 cnode9.0 kernel: [197071.199900] ---[ end trace e938a840610a79f7 ]---

  This appears to be the same bug as reported upstream in
  http://lkml.iu.edu/hypermail/linux/kernel/1603.2/01659.html

  According to this thread the issue is:

    27:   48 83 c1 01    add    $0x1,%rcx
    2b:*  48 f7 f1       div    %rcx    <-- trapping instruction

  This suggests the CONFIG_FAIR_GROUP_SCHED version of task_h_load:

    update_cfs_rq_h_load(cfs_rq);
    return div64_ul(p->se.avg.load_avg * cfs_rq->h_load,
                    cfs_rq_load_avg(cfs_rq) + 1);

  So cfs_rq_load_avg(cfs_rq) is -1; after adding 1 the divisor is 0
  and the div instruction traps (RCX is 0 in the register dump above).

  The fix of the LKML reporter was to include the patches to
  kernel/sched/fair.c up to 4.5. A specific patch was not identified.

  Please backport these patches for the Xenial kernel and the
  lts-xenial kernel in trusty.
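
To make the wrap-around concrete, here is a minimal userspace sketch of the arithmetic described above. It is not kernel code: divisor_for() and model_task_h_load() are hypothetical names invented for this illustration, and the zero-divisor guard is an assumption added so the example does not trap; the quoted task_h_load() has no such check.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for "cfs_rq_load_avg(cfs_rq) + 1".
 * In unsigned long arithmetic, a load average of (unsigned long)-1,
 * i.e. ULONG_MAX, wraps to 0 when 1 is added.
 */
static unsigned long divisor_for(unsigned long cfs_rq_load_avg)
{
    return cfs_rq_load_avg + 1;
}

/*
 * Userspace model of the expression quoted from task_h_load().
 * Returns false instead of dividing when the divisor has wrapped to 0;
 * this guard is an illustration only and is not present in the kernel.
 */
static bool model_task_h_load(unsigned long load_avg, unsigned long h_load,
                              unsigned long cfs_rq_load_avg, uint64_t *out)
{
    unsigned long div = divisor_for(cfs_rq_load_avg);

    assert(out != NULL);
    if (div == 0)
        return false;   /* the real code would execute div with RCX == 0 here */

    *out = (uint64_t)load_avg * h_load / div;
    return true;
}
```

Calling model_task_h_load() with cfs_rq_load_avg set to ULONG_MAX returns false, which corresponds to the case where the kernel executes the trapping div; any other value yields an ordinary quotient.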

