Hello everyone,
something very strange is driving me crazy with CephFS (kernel driver).
I copy a large directory onto the CephFS from one node. If I then run
'time ls -alR' on that directory from the same node, it completes in less
than one second.
If I run the same 'time ls -alR' from any other node, it takes several
minutes, and no matter how many times I repeat the command, the speed stays
abysmal. The ls is only fast on the node the initial copy was made from.
This happens with every directory I have tried, regardless of the kind of
data inside.
After lots of experimenting I have found that in order to have fast ls
speed for that dir from every node I need to flush the Linux cache on the
original node:
echo 3 > /proc/sys/vm/drop_caches
Unmounting and remounting the CephFS on that node does the trick too.
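For reference, the two workarounds look like this when run as root on the original node. The mountpoint /mnt/cephfs and the mount options are only hypothetical examples, not taken from my actual setup:

```shell
# Workaround 1: drop the page cache plus dentries and inodes on the
# node that performed the copy (requires root).
sync                                  # flush dirty pages to the cluster first
echo 3 > /proc/sys/vm/drop_caches     # 3 = pagecache + dentries and inodes

# Workaround 2: remount the CephFS kernel client instead.
# /mnt/cephfs and the mon address/options below are placeholders.
umount /mnt/cephfs
mount -t ceph [omissis]:6789:/ /mnt/cephfs \
    -o name=admin,secretfile=/etc/ceph/admin.secret
```

Either way, after the caches on the original node are gone, the listing is fast from every node.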
Does anyone have a clue about what's happening here? Could this be a
problem with the writeback fscache for CephFS?
Any help appreciated! Thanks and regards. :)
# uname -r
3.10.80-1.el6.elrepo.x86_64
# ceph -v
ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3)
# ceph -s
cluster f9ffbbd7-186b-483a-96ea-90cdadb81f2a
health HEALTH_OK
monmap e1: 3 mons at {[omissis]}
election epoch 60, quorum 0,1,2 [omissis]
mdsmap e59: 1/1/1 up {0=[omissis]=up:active}, 2 up:standby
osdmap e146: 2 osds: 2 up, 2 in
pgmap v122287: 256 pgs, 2 pools, 30709 MB data, 75239 objects
62432 MB used, 860 GB / 921 GB avail
256 active+clean
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com