Thank you, Gregory, for the answer. I will upgrade the kernel.
Do you know in which kernel version CephFS is stable? Thanks.

Att.
---
Daniel Takatori Ohara.
System Administrator - Lab. of Bioinformatics
Molecular Oncology Center
Instituto Sírio-Libanês de Ensino e Pesquisa
Hospital Sírio-Libanês
Phone: +55 11 3155-0200 (extension 1927)
R: Cel. Nicolau dos Santos, 69
São Paulo-SP. 01308-060
http://www.bioinfo.mochsl.org.br

On Wed, May 13, 2015 at 5:01 PM, Gregory Farnum <[email protected]> wrote:
> On Wed, May 13, 2015 at 12:08 PM, Daniel Takatori Ohara
> <[email protected]> wrote:
> > Hi,
> >
> > We have a small ceph cluster with 4 OSDs and 1 MDS.
> >
> > I run Ubuntu 14.04 with 3.13.0-52-generic on the clients, and CentOS 6.6
> > with 2.6.32-504.16.2.el6.x86_64 on the servers.
> >
> > The version of Ceph is 0.94.1.
> >
> > Sometimes CephFS freezes, and dmesg shows me the following messages:
> >
> > May 13 15:53:10 blade02 kernel: [93297.784094] ------------[ cut here ]------------
> > May 13 15:53:10 blade02 kernel: [93297.784121] WARNING: CPU: 10 PID: 299 at /build/buildd/linux-3.13.0/fs/ceph/inode.c:701 fill_inode.isra.8+0x9ed/0xa00 [ceph]()
> > May 13 15:53:10 blade02 kernel: [93297.784129] Modules linked in: 8021q garp stp mrp llc nfsv3 rpcsec_gss_krb5 nfsv4 ceph libceph libcrc32c intel_rapl x86_pkg_temp_thermal intel_powerclamp ipmi_devintf gpi
> > May 13 15:53:10 blade02 kernel: [93297.784204] CPU: 10 PID: 299 Comm: kworker/10:1 Tainted: G W 3.13.0-52-generic #86-Ubuntu
> > May 13 15:53:10 blade02 kernel: [93297.784207] Hardware name: Dell Inc. PowerEdge M520/050YHY, BIOS 2.1.3 01/20/2014
> > May 13 15:53:10 blade02 kernel: [93297.784221] Workqueue: ceph-msgr con_work [libceph]
> > May 13 15:53:10 blade02 kernel: [93297.784225] 0000000000000009 ffff880801093a28 ffffffff8172266e 0000000000000000
> > May 13 15:53:10 blade02 kernel: [93297.784233] ffff880801093a60 ffffffff810677fd 00000000ffffffea 0000000000000036
> > May 13 15:53:10 blade02 kernel: [93297.784239] 0000000000000000 0000000000000000 ffffc9001b73f9d8 ffff880801093a70
> > May 13 15:53:10 blade02 kernel: [93297.784246] Call Trace:
> > May 13 15:53:10 blade02 kernel: [93297.784257] [<ffffffff8172266e>] dump_stack+0x45/0x56
> > May 13 15:53:10 blade02 kernel: [93297.784264] [<ffffffff810677fd>] warn_slowpath_common+0x7d/0xa0
> > May 13 15:53:10 blade02 kernel: [93297.784269] [<ffffffff810678da>] warn_slowpath_null+0x1a/0x20
> > May 13 15:53:10 blade02 kernel: [93297.784280] [<ffffffffa046facd>] fill_inode.isra.8+0x9ed/0xa00 [ceph]
> > May 13 15:53:10 blade02 kernel: [93297.784290] [<ffffffffa046e3cd>] ? ceph_alloc_inode+0x1d/0x4e0 [ceph]
> > May 13 15:53:10 blade02 kernel: [93297.784302] [<ffffffffa04704cf>] ceph_readdir_prepopulate+0x27f/0x6d0 [ceph]
> > May 13 15:53:10 blade02 kernel: [93297.784318] [<ffffffffa048a704>] handle_reply+0x854/0xc70 [ceph]
> > May 13 15:53:10 blade02 kernel: [93297.784331] [<ffffffffa048c3f7>] dispatch+0xe7/0xa90 [ceph]
> > May 13 15:53:10 blade02 kernel: [93297.784342] [<ffffffffa02a4a78>] ? ceph_tcp_recvmsg+0x48/0x60 [libceph]
> > May 13 15:53:10 blade02 kernel: [93297.784354] [<ffffffffa02a7a9b>] try_read+0x4ab/0x10d0 [libceph]
> > May 13 15:53:10 blade02 kernel: [93297.784365] [<ffffffffa02a9418>] ? try_write+0x9a8/0xdb0 [libceph]
> > May 13 15:53:10 blade02 kernel: [93297.784373] [<ffffffff8101bc23>] ? native_sched_clock+0x13/0x80
> > May 13 15:53:10 blade02 kernel: [93297.784379] [<ffffffff8109d585>] ? sched_clock_cpu+0xb5/0x100
> > May 13 15:53:10 blade02 kernel: [93297.784390] [<ffffffffa02a98d9>] con_work+0xb9/0x640 [libceph]
> > May 13 15:53:10 blade02 kernel: [93297.784398] [<ffffffff81083aa2>] process_one_work+0x182/0x450
> > May 13 15:53:10 blade02 kernel: [93297.784403] [<ffffffff81084891>] worker_thread+0x121/0x410
> > May 13 15:53:10 blade02 kernel: [93297.784409] [<ffffffff81084770>] ? rescuer_thread+0x430/0x430
> > May 13 15:53:10 blade02 kernel: [93297.784414] [<ffffffff8108b5d2>] kthread+0xd2/0xf0
> > May 13 15:53:10 blade02 kernel: [93297.784420] [<ffffffff8108b500>] ? kthread_create_on_node+0x1c0/0x1c0
> > May 13 15:53:10 blade02 kernel: [93297.784426] [<ffffffff817330cc>] ret_from_fork+0x7c/0xb0
> > May 13 15:53:10 blade02 kernel: [93297.784431] [<ffffffff8108b500>] ? kthread_create_on_node+0x1c0/0x1c0
> > May 13 15:53:10 blade02 kernel: [93297.784434] ---[ end trace 05d3f5ee1f31bc67 ]---
> > May 13 15:53:10 blade02 kernel: [93297.784437] ceph: fill_inode badness on ffff8807f7eaa5c0
>
> I don't follow the kernel stuff too closely, but the CephFS kernel
> client is still improving quite rapidly, and 3.13 is old at this point.
> You could try upgrading to something newer.
> Zheng might also know what's going on and whether it's been fixed.
> -Greg
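Since Greg's advice hinges on whether the client kernel is new enough, a quick sanity check before and after upgrading can be sketched like this. The `version_ge` helper below is hypothetical (not part of Ceph or this thread) and assumes GNU coreutils `sort -V` for natural version ordering:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if version $1 is >= version $2.
# Relies on GNU `sort -V`, which orders version strings naturally.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# The clients in this thread run 3.13.0-52-generic; strip the
# "-52-generic" suffix and compare the base version against 3.13.
kernel="$(uname -r)"
base="${kernel%%-*}"
if version_ge "$base" "3.13"; then
    echo "kernel $kernel is 3.13 or newer"
else
    echo "kernel $kernel predates 3.13"
fi
```

This only verifies the version comparison; which kernel is "new enough" for a given CephFS release is the question posed above, and the actual upgrade path depends on the distribution.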
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
