Thank you for the answer, Gregory.

I will upgrade the kernel.

Do you know which kernel version CephFS is stable with?
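As a side note, a quick way to check whether a client kernel meets a given minimum is to compare `uname -r` against it with `sort -V`. This is only an illustrative sketch; the 4.1 threshold below is a placeholder, not a recommendation from this thread, so check the Ceph documentation for the actual minimum.

```shell
# Hypothetical helper (not from the thread): compares two dotted version
# strings using sort -V; prints "yes" if $1 >= $2, "no" otherwise.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ] \
        && echo yes || echo no
}

# Compare the running kernel against an illustrative 4.1 threshold.
version_ge "$(uname -r | cut -d- -f1)" 4.1
```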

Thanks.


Best regards,

---
Daniel Takatori Ohara.
System Administrator - Lab. of Bioinformatics
Molecular Oncology Center
Instituto Sírio-Libanês de Ensino e Pesquisa
Hospital Sírio-Libanês
Phone: +55 11 3155-0200 (extension 1927)
R: Cel. Nicolau dos Santos, 69
São Paulo-SP. 01308-060
http://www.bioinfo.mochsl.org.br


On Wed, May 13, 2015 at 5:01 PM, Gregory Farnum <[email protected]> wrote:

> On Wed, May 13, 2015 at 12:08 PM, Daniel Takatori Ohara
> <[email protected]> wrote:
> > Hi,
> >
> > We have a small ceph cluster with 4 OSD's and 1 MDS.
> >
> > I run Ubuntu 14.04 with 3.13.0-52-generic in the clients, and CentOS 6.6
> > with 2.6.32-504.16.2.el6.x86_64 in Servers.
> >
> > The version of Ceph is 0.94.1
> >
> > Sometimes, the CephFS freeze, and the dmesg show me the follow messages :
> >
> > May 13 15:53:10 blade02 kernel: [93297.784094] ------------[ cut here ]------------
> > May 13 15:53:10 blade02 kernel: [93297.784121] WARNING: CPU: 10 PID: 299 at /build/buildd/linux-3.13.0/fs/ceph/inode.c:701 fill_inode.isra.8+0x9ed/0xa00 [ceph]()
> > May 13 15:53:10 blade02 kernel: [93297.784129] Modules linked in: 8021q garp stp mrp llc nfsv3 rpcsec_gss_krb5 nfsv4 ceph libceph libcrc32c intel_rapl x86_pkg_temp_thermal intel_powerclamp ipmi_devintf gpi
> > May 13 15:53:10 blade02 kernel: [93297.784204] CPU: 10 PID: 299 Comm: kworker/10:1 Tainted: G        W     3.13.0-52-generic #86-Ubuntu
> > May 13 15:53:10 blade02 kernel: [93297.784207] Hardware name: Dell Inc. PowerEdge M520/050YHY, BIOS 2.1.3 01/20/2014
> > May 13 15:53:10 blade02 kernel: [93297.784221] Workqueue: ceph-msgr con_work [libceph]
> > May 13 15:53:10 blade02 kernel: [93297.784225]  0000000000000009 ffff880801093a28 ffffffff8172266e 0000000000000000
> > May 13 15:53:10 blade02 kernel: [93297.784233]  ffff880801093a60 ffffffff810677fd 00000000ffffffea 0000000000000036
> > May 13 15:53:10 blade02 kernel: [93297.784239]  0000000000000000 0000000000000000 ffffc9001b73f9d8 ffff880801093a70
> > May 13 15:53:10 blade02 kernel: [93297.784246] Call Trace:
> > May 13 15:53:10 blade02 kernel: [93297.784257]  [<ffffffff8172266e>] dump_stack+0x45/0x56
> > May 13 15:53:10 blade02 kernel: [93297.784264]  [<ffffffff810677fd>] warn_slowpath_common+0x7d/0xa0
> > May 13 15:53:10 blade02 kernel: [93297.784269]  [<ffffffff810678da>] warn_slowpath_null+0x1a/0x20
> > May 13 15:53:10 blade02 kernel: [93297.784280]  [<ffffffffa046facd>] fill_inode.isra.8+0x9ed/0xa00 [ceph]
> > May 13 15:53:10 blade02 kernel: [93297.784290]  [<ffffffffa046e3cd>] ? ceph_alloc_inode+0x1d/0x4e0 [ceph]
> > May 13 15:53:10 blade02 kernel: [93297.784302]  [<ffffffffa04704cf>] ceph_readdir_prepopulate+0x27f/0x6d0 [ceph]
> > May 13 15:53:10 blade02 kernel: [93297.784318]  [<ffffffffa048a704>] handle_reply+0x854/0xc70 [ceph]
> > May 13 15:53:10 blade02 kernel: [93297.784331]  [<ffffffffa048c3f7>] dispatch+0xe7/0xa90 [ceph]
> > May 13 15:53:10 blade02 kernel: [93297.784342]  [<ffffffffa02a4a78>] ? ceph_tcp_recvmsg+0x48/0x60 [libceph]
> > May 13 15:53:10 blade02 kernel: [93297.784354]  [<ffffffffa02a7a9b>] try_read+0x4ab/0x10d0 [libceph]
> > May 13 15:53:10 blade02 kernel: [93297.784365]  [<ffffffffa02a9418>] ? try_write+0x9a8/0xdb0 [libceph]
> > May 13 15:53:10 blade02 kernel: [93297.784373]  [<ffffffff8101bc23>] ? native_sched_clock+0x13/0x80
> > May 13 15:53:10 blade02 kernel: [93297.784379]  [<ffffffff8109d585>] ? sched_clock_cpu+0xb5/0x100
> > May 13 15:53:10 blade02 kernel: [93297.784390]  [<ffffffffa02a98d9>] con_work+0xb9/0x640 [libceph]
> > May 13 15:53:10 blade02 kernel: [93297.784398]  [<ffffffff81083aa2>] process_one_work+0x182/0x450
> > May 13 15:53:10 blade02 kernel: [93297.784403]  [<ffffffff81084891>] worker_thread+0x121/0x410
> > May 13 15:53:10 blade02 kernel: [93297.784409]  [<ffffffff81084770>] ? rescuer_thread+0x430/0x430
> > May 13 15:53:10 blade02 kernel: [93297.784414]  [<ffffffff8108b5d2>] kthread+0xd2/0xf0
> > May 13 15:53:10 blade02 kernel: [93297.784420]  [<ffffffff8108b500>] ? kthread_create_on_node+0x1c0/0x1c0
> > May 13 15:53:10 blade02 kernel: [93297.784426]  [<ffffffff817330cc>] ret_from_fork+0x7c/0xb0
> > May 13 15:53:10 blade02 kernel: [93297.784431]  [<ffffffff8108b500>] ? kthread_create_on_node+0x1c0/0x1c0
> > May 13 15:53:10 blade02 kernel: [93297.784434] ---[ end trace 05d3f5ee1f31bc67 ]---
> > May 13 15:53:10 blade02 kernel: [93297.784437] ceph: fill_inode badness on ffff8807f7eaa5c0
>
> I don't follow the kernel stuff too closely, but the CephFS kernel
> client is still improving quite rapidly and 3.13 is old at this point.
> You could try upgrading to something newer.
> Zheng might also know what's going on and if it's been fixed.
> -Greg
>
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com