This has happened to me before, but in a virtual machine environment. The VM was KVM and the storage was RBD. My problem turned out to be a bad network cable.
You should check the following details:

1) Do you use any kind of hardware RAID configuration (RAID 0, 5, or 10)? Ceph does not work well on hardware RAID. Run the RAID card in HBA (non-RAID) mode and let it pass the disks straight through (a rough controller check is sketched after the quoted message below).

2) Check your network connections. It may seem an obvious thing to look at, but believe me, the network is one of the most common culprits in Ceph environments (a few sanity checks are also sketched below).

3) If you are using SSDs, make sure they are in a non-RAID configuration as well.

On Tue, Feb 23, 2016 at 10:55 PM, fangchen sun <[email protected]> wrote:
> Dear all:
>
> I have a Ceph object storage cluster with 143 OSDs and 7 radosgw
> instances, with XFS as the underlying file system.
> I recently ran into a problem: sometimes an OSD is marked down when the
> return value of the function "chain_setxattr()" is -117. My only recourse
> is to unmount the disk and repair it with "xfs_repair".
>
> OS: CentOS 6.5
> Kernel version: 2.6.32
>
> The log from the dmesg command:
> [41796028.532225] Pid: 1438740, comm: ceph-osd Not tainted 2.6.32-925.431.23.3.letv.el6.x86_64 #1
> [41796028.532227] Call Trace:
> [41796028.532255] [<ffffffffa01e1e5f>] ? xfs_error_report+0x3f/0x50 [xfs]
> [41796028.532276] [<ffffffffa01d506a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
> [41796028.532296] [<ffffffffa01e1ece>] ? xfs_corruption_error+0x5e/0x90 [xfs]
> [41796028.532316] [<ffffffffa01d4f4c>] ? xfs_da_do_buf+0x6cc/0x770 [xfs]
> [41796028.532335] [<ffffffffa01d506a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
> [41796028.532359] [<ffffffffa0206fc7>] ? kmem_zone_alloc+0x77/0xf0 [xfs]
> [41796028.532380] [<ffffffffa01d506a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
> [41796028.532399] [<ffffffffa01bc481>] ? xfs_attr_leaf_addname+0x61/0x3d0 [xfs]
> [41796028.532426] [<ffffffffa01bc481>] ? xfs_attr_leaf_addname+0x61/0x3d0 [xfs]
> [41796028.532455] [<ffffffffa01ff187>] ? xfs_trans_add_item+0x57/0x70 [xfs]
> [41796028.532476] [<ffffffffa01cc208>] ? xfs_bmbt_get_all+0x18/0x20 [xfs]
> [41796028.532495] [<ffffffffa01bcbb4>] ? xfs_attr_set_int+0x3c4/0x510 [xfs]
> [41796028.532517] [<ffffffffa01d4f5b>] ? xfs_da_do_buf+0x6db/0x770 [xfs]
> [41796028.532536] [<ffffffffa01bcd81>] ? xfs_attr_set+0x81/0x90 [xfs]
> [41796028.532560] [<ffffffffa0216cc3>] ? __xfs_xattr_set+0x43/0x60 [xfs]
> [41796028.532584] [<ffffffffa0216d31>] ? xfs_xattr_user_set+0x11/0x20 [xfs]
> [41796028.532592] [<ffffffff811aee92>] ? generic_setxattr+0xa2/0xb0
> [41796028.532596] [<ffffffff811b134e>] ? __vfs_setxattr_noperm+0x4e/0x160
> [41796028.532600] [<ffffffff81196b77>] ? inode_permission+0xa7/0x100
> [41796028.532604] [<ffffffff811b151c>] ? vfs_setxattr+0xbc/0xc0
> [41796028.532607] [<ffffffff811b15f0>] ? setxattr+0xd0/0x150
> [41796028.532612] [<ffffffff8105af80>] ? __dequeue_entity+0x30/0x50
> [41796028.532617] [<ffffffff8100988e>] ? __switch_to+0x26e/0x320
> [41796028.532621] [<ffffffff8118aec0>] ? __sb_start_write+0x80/0x120
> [41796028.532626] [<ffffffff8152912e>] ? thread_return+0x4e/0x760
> [41796028.532630] [<ffffffff811b171d>] ? sys_fsetxattr+0xad/0xd0
> [41796028.532633] [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
> [41796028.532636] XFS (sdi1): Corruption detected. Unmount and run xfs_repair
>
> Any comments will be much appreciated!
>
> Best Regards!
> sunspot
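For point 1, here is a rough way to see whether the disks sit behind RAID virtual drives or are truly passed through. The storcli line is MegaRAID-specific, so treat it as an assumption for your particular card and substitute your vendor's tool:

    lspci | grep -i raid     # identify the controller model
    lsscsi                   # passed-through disks show their real vendor/model;
                             # RAID virtual drives show up as things like "MR..." or "PERC"
    storcli64 /c0 show       # MegaRAID only: look for JBOD/HBA vs. RAID personality
    smartctl -a /dev/sdi     # if this only works with "-d megaraid,N", the disk
                             # is hidden behind the RAID firmware

If the card has no true HBA/JBOD mode, one single-disk RAID 0 per OSD is the usual workaround, but real pass-through is much preferred.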
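For point 2, these are the quick checks I would run on every OSD host; eth0 and the peer hostname below are placeholders, replace them with your own:

    ip -s link show eth0                      # look for RX/TX errors and drops
    ethtool -S eth0 | grep -i -e err -e drop  # a bad cable usually shows up as
                                              # steadily growing CRC/RX error counters
    ping -c 5 -M do -s 8972 <peer-osd-host>   # only if you run a 9000-byte MTU:
                                              # verifies jumbo frames end to end

Read the counters twice a few minutes apart; it is the growth that matters, not the absolute numbers.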

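And about the -117 return value itself: errno 117 on Linux is EUCLEAN ("Structure needs cleaning"), which is what the kernel hands back when XFS detects on-disk corruption, so it matches the "Corruption detected. Unmount and run xfs_repair" line in your dmesg output. You can confirm the mapping, and check the disk before trusting another repair, like this (sdi1 taken from your log):

    python -c 'import os; print os.strerror(117)'   # prints "Structure needs cleaning"
                                                    # (Python 2, as shipped with CentOS 6)
    umount /dev/sdi1
    xfs_repair -n /dev/sdi1                         # -n = check only, report what would be fixed
    smartctl -a /dev/sdi | grep -i -e realloc -e pending -e uncorrect
                                                    # growing reallocated/pending sector counts
                                                    # point at failing media, not at XFS or Ceph

Recurring corruption on the same OSD very often points at failing media or a misbehaving controller rather than at Ceph.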