Hi,

If you are using the kernel client, I would suggest trying something newer than 3.10.x. I ran into this issue in the past, but it was fixed by updating my kernel to something newer. You may want to check the OS recommendations page as well: http://docs.ceph.com/docs/master/start/os-recommendations/

ELRepo maintains mainline RPMs for EL6 and EL7: http://elrepo.org/tiki/kernel-ml

Alternatively, you could try the FUSE client.
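Roughly, the upgrade path on a CentOS 7 client looks like this (the exact elrepo-release RPM to install changes over time, so check elrepo.org for the current one before enabling the repo):

# yum --enablerepo=elrepo-kernel install kernel-ml
# grub2-set-default 0
# reboot
# uname -r

And if you want to try the FUSE client instead, something like the following should work (this assumes ceph.conf and the client keyring are already in /etc/ceph on the node; "mon-host" and the mount point are placeholders for your setup):

# yum install ceph-fuse
# ceph-fuse -m mon-host:6789 /cephfs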
—Lincoln

> On Mar 23, 2016, at 11:12 AM, FaHui Lin <[email protected]> wrote:
> 
> Dear Ceph experts,
> 
> We run into a nasty problem with our CephFS from time to time:
> 
> When we try to list a directory under CephFS, some files or directories do not show up. For example:
> 
> This is the complete directory content:
> # ll /cephfs/ies/home/mika
> drwxr-xr-x 1 10035 100001 1559018781 Feb 2 07:43 dir-A
> drwxr-xr-x 1 10035 100001 9061906 Apr 15 2015 dir-B
> -rw-r--r-- 1 10035 100001 130750361 Aug 6 2015 file-1
> -rw-r--r-- 1 10035 100001 72640608 Apr 15 2015 file-2
> 
> But sometimes we get only part of the files/directories when listing, say:
> # ll /cephfs/ies/home/mika
> drwxr-xr-x 1 10035 100001 1559018781 Feb 2 07:43 dir-A
> -rw-r--r-- 1 10035 100001 72640608 Apr 15 2015 file-2
> Here dir-B and file-1 are missing.
> 
> We found the files themselves are still intact, since we can still see them on another node mounting the same CephFS, or just at another time. So we think this is a metadata problem.
> 
> One thing we found interesting is that remounting CephFS or restarting the MDS service will NOT help, but creating a new file under the directory may help:
> 
> # ll /cephfs/ies/home/mika
> drwxr-xr-x 1 10035 100001 1559018781 Feb 2 07:43 dir-A
> -rw-r--r-- 1 10035 100001 72640608 Apr 15 2015 file-2
> # touch /cephfs/ies/home/mika/file-tmp
> # ll /cephfs/ies/home/mika
> drwxr-xr-x 1 10035 100001 1559018781 Feb 2 07:43 dir-A
> drwxr-xr-x 1 10035 100001 9061906 Apr 15 2015 dir-B
> -rw-r--r-- 1 10035 100001 130750361 Aug 6 2015 file-1
> -rw-r--r-- 1 10035 100001 72640608 Apr 15 2015 file-2
> -rw-r--r-- 1 root root 0 Mar 23 15:34 file-tmp
> 
> Strangely, when this happens, the cluster health usually shows HEALTH_OK, and there are no significant errors in the MDS or other service logs.
> 
> One thing we tried is increasing the MDS mds_cache_size to 1600000 (16x the default value), which does help to alleviate warnings like "mds0: Client failing to respond to cache pressure", but it does not solve the missing-metadata problem.
> 
> Here's our ceph server info:
> 
> # ceph -s
>     cluster d15a2cdb-354c-4bcd-a246-23521f1a7122
>      health HEALTH_OK
>      monmap e1: 3 mons at {as-ceph01=117.103.102.128:6789/0,as-ceph02=117.103.103.93:6789/0,as-ceph03=117.103.109.124:6789/0}
>             election epoch 6, quorum 0,1,2 as-ceph01,as-ceph02,as-ceph03
>      mdsmap e144: 1/1/1 up {0=as-ceph02=up:active}, 1 up:standby
>      osdmap e178: 10 osds: 10 up, 10 in
>             flags sortbitwise
>       pgmap v105168: 256 pgs, 4 pools, 505 GB data, 1925 kobjects
>             1083 GB used, 399 TB / 400 TB avail
>                  256 active+clean
>   client io 614 B/s rd, 0 op/s
> 
> # ceph --version
> ceph version 9.2.1 (752b6a3020c3de74e07d2a8b4c5e48dab5a6b6fd)
> 
> (We also met the same problem on the Hammer release.)
> 
> # uname -r
> 3.10.0-327.10.1.el7.x86_64
> 
> We're using CentOS 7 servers.
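(As a side note, the mds_cache_size bump described above is normally made persistent in ceph.conf on the MDS host and can also be applied to the running daemon over the admin socket; a minimal sketch, using the as-ceph02 daemon name from the status output above:)

In /etc/ceph/ceph.conf on the MDS host:
[mds]
    mds cache size = 1600000

Applied live, without restarting the MDS:
# ceph daemon mds.as-ceph02 config set mds_cache_size 1600000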
> # ceph daemon mds.as-ceph02 perf dump
> {
>   "mds": { "request": 76066, "reply": 76066, "reply_latency": { "avgcount": 76066, "sum": 61.151796797 },
>     "forward": 0, "dir_fetch": 1050, "dir_commit": 1017, "dir_split": 0,
>     "inode_max": 1600000, "inodes": 130657, "inodes_top": 110882, "inodes_bottom": 19775,
>     "inodes_pin_tail": 0, "inodes_pinned": 99670, "inodes_expired": 0, "inodes_with_caps": 99606,
>     "caps": 105119, "subtrees": 2,
>     "traverse": 81583, "traverse_hit": 74090, "traverse_forward": 0, "traverse_discover": 0,
>     "traverse_dir_fetch": 24, "traverse_remote_ino": 0, "traverse_lock": 80,
>     "load_cent": 7606600, "q": 0,
>     "exported": 0, "exported_inodes": 0, "imported": 0, "imported_inodes": 0 },
>   "mds_cache": { "num_strays": 120, "num_strays_purging": 0, "num_strays_delayed": 0, "num_purge_ops": 0,
>     "strays_created": 17276, "strays_purged": 17155, "strays_reintegrated": 1, "strays_migrated": 0,
>     "num_recovering_processing": 0, "num_recovering_enqueued": 0, "num_recovering_prioritized": 0,
>     "recovery_started": 0, "recovery_completed": 0 },
>   "mds_log": { "evadd": 116253, "evex": 123148, "evtrm": 123148, "ev": 22378, "evexg": 0, "evexd": 17,
>     "segadd": 157, "segex": 157, "segtrm": 157, "seg": 31, "segexg": 0, "segexd": 1,
>     "expos": 53624211952, "wrpos": 53709306372, "rdpos": 53354921818, "jlat": 0 },
>   "mds_mem": { "ino": 129334, "ino+": 146489, "ino-": 17155, "dir": 3961, "dir+": 4741, "dir-": 780,
>     "dn": 130657, "dn+": 163760, "dn-": 33103, "cap": 105119, "cap+": 122281, "cap-": 17162,
>     "rss": 444444, "heap": 50108, "malloc": 402511, "buf": 0 },
>   "mds_server": { "handle_client_request": 76066, "handle_slave_request": 0, "handle_client_session": 176954,
>     "dispatch_client_request": 80245, "dispatch_server_request": 0 },
>   "objecter": { "op_active": 0, "op_laggy": 0, "op_send": 61860, "op_send_bytes": 0, "op_resend": 0,
>     "op_ack": 7719, "op_commit": 54141, "op": 61860, "op_r": 7719, "op_w": 54141, "op_rmw": 0, "op_pg": 0,
>     "osdop_stat": 119, "osdop_create": 26905, "osdop_read": 21, "osdop_write": 8537, "osdop_writefull": 254,
>     "osdop_append": 0, "osdop_zero": 1, "osdop_truncate": 0, "osdop_delete": 17325, "osdop_mapext": 0,
>     "osdop_sparse_read": 0, "osdop_clonerange": 0, "osdop_getxattr": 7695, "osdop_setxattr": 53810,
>     "osdop_cmpxattr": 0, "osdop_rmxattr": 0, "osdop_resetxattrs": 0, "osdop_tmap_up": 0, "osdop_tmap_put": 0,
>     "osdop_tmap_get": 0, "osdop_call": 0, "osdop_watch": 0, "osdop_notify": 0, "osdop_src_cmpxattr": 0,
>     "osdop_pgls": 0, "osdop_pgls_filter": 0, "osdop_other": 1111,
>     "linger_active": 0, "linger_send": 0, "linger_resend": 0, "linger_ping": 0,
>     "poolop_active": 0, "poolop_send": 0, "poolop_resend": 0,
>     "poolstat_active": 0, "poolstat_send": 0, "poolstat_resend": 0,
>     "statfs_active": 0, "statfs_send": 0, "statfs_resend": 0,
>     "command_active": 0, "command_send": 0, "command_resend": 0,
>     "map_epoch": 178, "map_full": 0, "map_inc": 3,
>     "osd_sessions": 55, "osd_session_open": 182, "osd_session_close": 172, "osd_laggy": 0,
>     "omap_wr": 1972, "omap_rd": 2102, "omap_del": 40 },
>   "throttle-msgr_dispatch_throttler-mds": { "val": 0, "max": 104857600, "get": 450630, "get_sum": 135500995,
>     "get_or_fail_fail": 0, "get_or_fail_success": 0, "take": 0, "take_sum": 0, "put": 450630, "put_sum": 135500995,
>     "wait": { "avgcount": 0, "sum": 0.000000000 } },
>   "throttle-objecter_bytes": { "val": 0, "max": 104857600, "get": 0, "get_sum": 0,
>     "get_or_fail_fail": 0, "get_or_fail_success": 0, "take": 61860, "take_sum": 453992030,
>     "put": 44433, "put_sum": 453992030, "wait": { "avgcount": 0, "sum": 0.000000000 } },
>   "throttle-objecter_ops": { "val": 0, "max": 1024, "get": 0, "get_sum": 0,
>     "get_or_fail_fail": 0, "get_or_fail_success": 0, "take": 61860, "take_sum": 61860,
>     "put": 61860, "put_sum": 61860, "wait": { "avgcount": 0, "sum": 0.000000000 } }
> }
> 
> This problem troubles us a lot, since our CephFS serves as a shared network file system for 100+ computing nodes (mounted with mount.ceph), and it causes jobs doing I/O on CephFS to fail.
> 
> I'd like to ask:
> 
> 1) What could be the main cause of this problem? Or, how can we trace it? We cannot really reproduce the problem on purpose; it just happens occasionally.
> 
> 2) Since our CephFS is now in production use, is there anything we can do to improve stability? We have 100+ computing nodes requiring a shared file system containing tens of millions of files, and I wonder whether the (single) MDS server can handle them well. Should we use the ceph-fuse mount or the kernel (mount.ceph) mount? Should we have only 3~5 servers mount CephFS and then share the mountpoint with the other nodes over NFS, in order to reduce the load on the MDS server? What is a proper cluster structure for using CephFS?
> 
> Any advice or comment will be appreciated. Thank you.
> 
> Best Regards,
> FaHui
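For what it's worth, the NFS re-export layout floated in question 2 would look roughly like the following on each of the 3~5 gateway nodes (the mount point, client name, secret file and export subnet here are only placeholders, not taken from the cluster above):

In /etc/fstab on the gateway (kernel CephFS mount):
117.103.102.128:6789:/ /cephfs ceph name=admin,secretfile=/etc/ceph/admin.secret,noatime,_netdev 0 0

In /etc/exports on the gateway (re-export to the compute nodes):
/cephfs 10.0.0.0/24(rw,no_root_squash)

# exportfs -ra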
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
