This has happened to me before, but in a virtual machine environment.

The hypervisor was KVM and the storage was RBD. My problem turned out to be a bad network cable.

You should check the following details:

1-) Do you use any kind of hardware RAID configuration? (RAID 0, 5 or 10)

Ceph does not work well on hardware RAID. You should put the RAID card
in HBA (non-RAID) mode and let it pass the disks through directly.
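A quick sanity check is to look at the disk model strings the kernel sees. This is just a sketch of the idea: with the controller in pass-through mode each physical disk shows up with its real model name, while a hardware RAID volume usually shows up as a single "virtual disk" (model strings like "PERC", "MegaRAID" or "Virtual Disk").

```python
import glob
import os

# Each block device exposes its model string under
# /sys/block/<dev>/device/model on Linux. A RAID-controller virtual
# disk typically reports the controller's name instead of the real
# drive model, which is a hint that pass-through is NOT enabled.
suspicious = ("PERC", "MegaRAID", "Virtual", "LOGICAL")

for path in sorted(glob.glob("/sys/block/*/device/model")):
    dev = path.split(os.sep)[3]
    with open(path) as f:
        model = f.read().strip()
    flag = "  <-- looks like a RAID virtual disk?" if any(
        s.lower() in model.lower() for s in suspicious) else ""
    print(f"{dev}: {model}{flag}")
```

This only looks at model strings, so treat it as a hint; your RAID card's own CLI is the authoritative place to check the mode.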

2-) Check your network connections

It may seem like an obvious thing to check, but believe me, the network is
one of the top culprits in Ceph environments.
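For example, the kernel keeps per-interface error counters under /sys/class/net/. Non-zero rx_errors or rx_crc_errors on a cluster-facing NIC often points at exactly the kind of bad cable I had. A minimal sketch (it walks whatever interfaces exist on the box; "lo" is the only name assumed to be present):

```python
import glob
import os

# Print RX/TX error counters for every network interface.
# Counters live in /sys/class/net/<iface>/statistics/ on Linux.
for iface_dir in sorted(glob.glob("/sys/class/net/*")):
    iface = os.path.basename(iface_dir)
    counters = {}
    for name in ("rx_errors", "tx_errors"):
        path = os.path.join(iface_dir, "statistics", name)
        with open(path) as f:
            counters[name] = int(f.read())
    print(iface, counters)
```

Anything persistently incrementing there is worth chasing before blaming Ceph itself.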

3-) If you are using SSDs, make sure they are in a non-RAID configuration as well.
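By the way, the -117 you are seeing from "chain_setxattr()" is just errno 117, which on Linux is EUCLEAN ("Structure needs cleaning") and matches the "Corruption detected. Unmount and run xfs_repair" line in your dmesg output. You can confirm the mapping quickly:

```python
import errno
import os

# errno 117 on Linux is EUCLEAN, the code XFS returns (negated)
# when it detects on-disk corruption.
print(errno.errorcode[117])   # EUCLEAN
print(os.strerror(117))       # Structure needs cleaning
```

So the OSD going down is a symptom of real filesystem corruption, not a Ceph-level bug.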



On Tue, Feb 23, 2016 at 10:55 PM, fangchen sun <[email protected]>
wrote:

> Dear all:
>
> I have a Ceph object storage cluster with 143 OSDs and 7 radosgw
> instances, and chose XFS as the underlying file system.
> I recently ran into a problem where an OSD is sometimes marked down when
> the function "chain_setxattr()" returns -117. To recover, I have to
> umount the disk and repair it with "xfs_repair".
>
> os: centos 6.5
> kernel version: 2.6.32
>
> the log for dmesg command:
> [41796028.532225] Pid: 1438740, comm: ceph-osd Not tainted
> 2.6.32-925.431.23.3.letv.el6.x86_64 #1
> [41796028.532227] Call Trace:
> [41796028.532255]  [<ffffffffa01e1e5f>] ? xfs_error_report+0x3f/0x50 [xfs]
> [41796028.532276]  [<ffffffffa01d506a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
> [41796028.532296]  [<ffffffffa01e1ece>] ? xfs_corruption_error+0x5e/0x90
> [xfs]
> [41796028.532316]  [<ffffffffa01d4f4c>] ? xfs_da_do_buf+0x6cc/0x770 [xfs]
> [41796028.532335]  [<ffffffffa01d506a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
> [41796028.532359]  [<ffffffffa0206fc7>] ? kmem_zone_alloc+0x77/0xf0 [xfs]
> [41796028.532380]  [<ffffffffa01d506a>] ? xfs_da_read_buf+0x2a/0x30 [xfs]
> [41796028.532399]  [<ffffffffa01bc481>] ? xfs_attr_leaf_addname+0x61/0x3d0
> [xfs]
> [41796028.532426]  [<ffffffffa01bc481>] ? xfs_attr_leaf_addname+0x61/0x3d0
> [xfs]
> [41796028.532455]  [<ffffffffa01ff187>] ? xfs_trans_add_item+0x57/0x70
> [xfs]
> [41796028.532476]  [<ffffffffa01cc208>] ? xfs_bmbt_get_all+0x18/0x20 [xfs]
> [41796028.532495]  [<ffffffffa01bcbb4>] ? xfs_attr_set_int+0x3c4/0x510
> [xfs]
> [41796028.532517]  [<ffffffffa01d4f5b>] ? xfs_da_do_buf+0x6db/0x770 [xfs]
> [41796028.532536]  [<ffffffffa01bcd81>] ? xfs_attr_set+0x81/0x90 [xfs]
> [41796028.532560]  [<ffffffffa0216cc3>] ? __xfs_xattr_set+0x43/0x60 [xfs]
> [41796028.532584]  [<ffffffffa0216d31>] ? xfs_xattr_user_set+0x11/0x20
> [xfs]
> [41796028.532592]  [<ffffffff811aee92>] ? generic_setxattr+0xa2/0xb0
> [41796028.532596]  [<ffffffff811b134e>] ? __vfs_setxattr_noperm+0x4e/0x160
> [41796028.532600]  [<ffffffff81196b77>] ? inode_permission+0xa7/0x100
> [41796028.532604]  [<ffffffff811b151c>] ? vfs_setxattr+0xbc/0xc0
> [41796028.532607]  [<ffffffff811b15f0>] ? setxattr+0xd0/0x150
> [41796028.532612]  [<ffffffff8105af80>] ? __dequeue_entity+0x30/0x50
> [41796028.532617]  [<ffffffff8100988e>] ? __switch_to+0x26e/0x320
> [41796028.532621]  [<ffffffff8118aec0>] ? __sb_start_write+0x80/0x120
> [41796028.532626]  [<ffffffff8152912e>] ? thread_return+0x4e/0x760
> [41796028.532630]  [<ffffffff811b171d>] ? sys_fsetxattr+0xad/0xd0
> [41796028.532633]  [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
> [41796028.532636] XFS (sdi1): Corruption detected. Unmount and run
> xfs_repair
>
> Any comments will be much appreciated!
>
> Best Regards!
> sunspot
>
> _______________________________________________
> ceph-users mailing list
> [email protected]
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>