Hi, I'm not sure if this question makes sense, but: would client-side rate control (limiting the number of requests sent per second) help avoid an MDS crash?
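To make the question concrete, here is a minimal sketch of the kind of client-side rate control meant above. It is a plain token bucket in Python. The names `TokenBucket` and `create_files`, the file naming scheme, and the example rate are illustrative assumptions, not taken from the actual test program.

```python
import os
import time


class TokenBucket:
    """Allow at most `rate` operations per second, with bursts up to `burst`."""

    def __init__(self, rate, burst=1, clock=time.monotonic, sleep=time.sleep):
        if rate <= 0 or burst < 1:
            raise ValueError("rate must be > 0 and burst must be >= 1")
        self.rate = float(rate)
        self.burst = float(burst)
        self.tokens = float(burst)   # start with a full bucket
        self.clock = clock           # injectable so the limiter can be tested
        self.sleep = sleep
        self.last = clock()

    def acquire(self):
        """Block until one token is available, then consume it."""
        while True:
            now = self.clock()
            # Refill in proportion to elapsed time, capped at the burst size.
            self.tokens = min(self.burst,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= 1.0:
                self.tokens -= 1.0
                return
            # Sleep just long enough for the missing fraction to refill.
            self.sleep((1.0 - self.tokens) / self.rate)


def create_files(directory, count, bucket):
    """Create `count` empty files in `directory`, throttled by `bucket`."""
    os.makedirs(directory, exist_ok=True)
    for i in range(count):
        bucket.acquire()
        path = os.path.join(directory, f"f{i:08d}")
        # O_EXCL makes every iteration a real create, never a re-open.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        os.close(fd)
```

Each client process would call something like `create_files("/mnt/cephfs/dir0", 100000, TokenBucket(rate=200))`, where the mount path and the rate are placeholders. The aggregate load on the MDS cluster is then roughly the per-client rate times the number of clients, and the rate could be raised step by step once the tree has been distributed.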
I'm currently trying to get baseline metadata performance numbers for CephFS with multiple *active* MDS servers and directory splitting. The plan is to increase the number of MDS servers from 8 to 16, or more. The problems I'm having right now are:

1) My test program keeps creating empty files under multiple directories in CephFS, and some MDS servers may crash in the middle of a test run. I think it is somewhat unfair to report the throughput I get before the crash, since at the beginning only one MDS server is doing the work. It takes time for Ceph to balance the load across all of its *active* MDS servers, and CephFS might crash before that balance is reached. So if we let the clients create files slowly until every active MDS server has a share of the file system tree, would that reduce the odds of a crash?

2) How do we know whether a set of active MDS servers is load balanced? Currently I check the CPU usage on these MDS servers. Are there better ways to do this?

Cheers,
-- Qing