You'll need to upgrade your kernel. It's a terrible divide-by-zero bug that occurs while the scheduler is calculating load. You can still use "top -b -n1" instead of ps, but ultimately the kernel update is what fixed it for us. You can't kill processes that are stuck in uninterruptible wait.
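
If you need to poke around while you're still on the broken kernel, here's a minimal sketch of what worked for us. It assumes top's default batch-mode field layout (state is column 8, command is column 12); ps presumably hangs because it reads /proc/<pid>/cmdline, which blocks behind the stuck task.

    # Snapshot the process table without ps; top reads /proc/<pid>/stat,
    # which doesn't block the way the /proc/<pid>/cmdline read can.
    top -b -n1

    # List just the tasks stuck in uninterruptible sleep (state "D").
    top -b -n1 | awk '$8 == "D" {print $1, $12}'

Don't bother sending those PIDs a kill -9; a task in D state won't act on signals.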

Here's the Ubuntu bug report: 
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1568729
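
If you want to confirm you're hitting the same divide, the kernel source tree ships scripts/decodecode, which disassembles the Code: line from an oops. Roughly (the file path is just an example):

    # Save the oops text (at least the Code: line) and disassemble it.
    # In the trace below, the bytes at the fault marker, <48> f7 f1,
    # decode to "div rcx" -- and RCX is 0 in the register dump.
    ./scripts/decodecode < /tmp/oops.txt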

Warren Wang
Walmart ✻

From: ceph-users <ceph-users-boun...@lists.ceph.com> on behalf of VELARTIS 
Philipp Dürhammer <p.duerham...@velartis.at>
Date: Thursday, December 1, 2016 at 7:19 AM
To: "'ceph-users@lists.ceph.com'" <ceph-users@lists.ceph.com>
Subject: [ceph-users] osd crash - disk hangs

Hello!

Tonight I had an OSD crash. See the dump below. The OSD is also still mounted. 
What's the cause? A bug? What should I do next? I can't run lsof or ps ax 
because they hang.

Thank You!

Dec  1 00:31:30 ceph2 kernel: [17314369.493029] divide error: 0000 [#1] SMP
Dec  1 00:31:30 ceph2 kernel: [17314369.493062] Modules linked in: act_police cls_basic sch_ingress sch_htb vhost_net vhost macvtap macvlan 8021q garp mrp veth nfsv3 softdog ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables xt_mac ipt_REJECT nf_reject_ipv4 xt_NFLOG nfnetlink_log xt_physdev nf_conntrack_ipv4 nf_defrag_ipv4 xt_comment xt_tcpudp xt_addrtype xt_multiport xt_conntrack xt_set xt_mark ip_set_hash_net ip_set nfnetlink iptable_filter ip_tables x_tables nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi bonding xfs libcrc32c ipmi_ssif mxm_wmi x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd snd_pcm snd_timer snd soundcore pcspkr input_leds sb_edac shpchp edac_core mei_me ioatdma mei lpc_ich i2c_i801 ipmi_si 8250_fintek wmi ipmi_msghandler mac_hid nf_conntrack_ftp nf_conntrack autofs4 ses enclosure hid_generic usbmouse usbkbd usbhid hid ixgbe(O) vxlan ip6_udp_tunnel megaraid_sas udp_tunnel isci ahci libahci libsas igb(O) scsi_transport_sas dca ptp pps_core fjes
Dec  1 00:31:30 ceph2 kernel: [17314369.493708] CPU: 1 PID: 17291 Comm: ceph-osd Tainted: G           O    4.4.8-1-pve #1
Dec  1 00:31:30 ceph2 kernel: [17314369.493754] Hardware name: Thomas-Krenn.AG X9DR3-F/X9DR3-F, BIOS 3.0a 07/31/2013
Dec  1 00:31:30 ceph2 kernel: [17314369.493799] task: ffff881f6ff05280 ti: ffff880037c4c000 task.ti: ffff880037c4c000
Dec  1 00:31:30 ceph2 kernel: [17314369.493843] RIP: 0010:[<ffffffff810b58fd>]  [<ffffffff810b58fd>] task_numa_find_cpu+0x23d/0x710
Dec  1 00:31:30 ceph2 kernel: [17314369.493893] RSP: 0000:ffff880037c4fbd8  EFLAGS: 00010257
Dec  1 00:31:30 ceph2 kernel: [17314369.493919] RAX: 0000000000000000 RBX: ffff880037c4fc80 RCX: 0000000000000000
Dec  1 00:31:30 ceph2 kernel: [17314369.493962] RDX: 0000000000000000 RSI: ffff88103fa40000 RDI: ffff881033f50c00
Dec  1 00:31:30 ceph2 kernel: [17314369.494006] RBP: ffff880037c4fc48 R08: 0000000202046ea8 R09: 000000000000036b
Dec  1 00:31:30 ceph2 kernel: [17314369.494049] R10: 000000000000007c R11: 0000000000000540 R12: ffff88064fbd0000
Dec  1 00:31:30 ceph2 kernel: [17314369.494093] R13: 0000000000000250 R14: 0000000000000540 R15: 0000000000000009
Dec  1 00:31:30 ceph2 kernel: [17314369.494136] FS:  00007ff17dd6c700(0000) GS:ffff88103fa40000(0000) knlGS:0000000000000000
Dec  1 00:31:30 ceph2 kernel: [17314369.494182] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec  1 00:31:30 ceph2 kernel: [17314369.494209] CR2: 00007ff17dd6aff8 CR3: 0000001025e4b000 CR4: 00000000001426e0
Dec  1 00:31:30 ceph2 kernel: [17314369.494252] Stack:
Dec  1 00:31:30 ceph2 kernel: [17314369.494273]  ffff880037c4fbe8 ffffffff81038219 000000000000003f 0000000000017180
Dec  1 00:31:30 ceph2 kernel: [17314369.494323]  ffff881f6ff05280 0000000000017180 0000000000000251 ffffffffffffffe7
Dec  1 00:31:30 ceph2 kernel: [17314369.494374]  0000000000000251 ffff881f6ff05280 ffff880037c4fc80 00000000000000cb
Dec  1 00:31:30 ceph2 kernel: [17314369.494424] Call Trace:
Dec  1 00:31:30 ceph2 kernel: [17314369.494449]  [<ffffffff81038219>] ? sched_clock+0x9/0x10
Dec  1 00:31:30 ceph2 kernel: [17314369.494476]  [<ffffffff810b62b6>] task_numa_migrate+0x4e6/0xa00
Dec  1 00:31:30 ceph2 kernel: [17314369.494506]  [<ffffffff813fea6c>] ? copy_to_iter+0x7c/0x260
Dec  1 00:31:30 ceph2 kernel: [17314369.494534]  [<ffffffff810b6849>] numa_migrate_preferred+0x79/0x80
Dec  1 00:31:30 ceph2 kernel: [17314369.494563]  [<ffffffff810bb348>] task_numa_fault+0x848/0xd10
Dec  1 00:31:30 ceph2 kernel: [17314369.494591]  [<ffffffff810ba969>] ? should_numa_migrate_memory+0x59/0x130
Dec  1 00:31:30 ceph2 kernel: [17314369.494623]  [<ffffffff811c0314>] handle_mm_fault+0xc64/0x1a20
Dec  1 00:31:30 ceph2 kernel: [17314369.494654]  [<ffffffff8170c3f4>] ? SYSC_recvfrom+0x144/0x160
Dec  1 00:31:30 ceph2 kernel: [17314369.494684]  [<ffffffff8106b4ed>] __do_page_fault+0x19d/0x410
Dec  1 00:31:30 ceph2 kernel: [17314369.494713]  [<ffffffff81003360>] ? exit_to_usermode_loop+0xb0/0xd0
Dec  1 00:31:30 ceph2 kernel: [17314369.494742]  [<ffffffff8106b782>] do_page_fault+0x22/0x30
Dec  1 00:31:30 ceph2 kernel: [17314369.494771]  [<ffffffff8184ab38>] page_fault+0x28/0x30
Dec  1 00:31:30 ceph2 kernel: [17314369.494797] Code: 4d b0 4c 89 ef e8 b4 d0 ff ff 48 8b 4d b0 49 8b 85 b0 00 00 00 31 d2 48 0f af 81 d8 01 00 00 49 8b 4d 78 4c 8b 6b 78 48 83 c1 01 <48> f7 f1 48 8b 4b 20 49 89 c0 48 29 c1 4c 03 43 48 4c 39 75 d0
Dec  1 00:31:30 ceph2 kernel: [17314369.495005] RIP  [<ffffffff810b58fd>] task_numa_find_cpu+0x23d/0x710
Dec  1 00:31:30 ceph2 kernel: [17314369.495035]  RSP <ffff880037c4fbd8>
Dec  1 00:31:30 ceph2 kernel: [17314369.495347] ---[ end trace 7106c9a72840cc7d ]---

