Hi - We are currently evaluating CephFS's metadata scalability and performance. One important feature of CephFS is its support for running multiple "active" mds instances and partitioning huge directories into small shards.
We use mdtest to simulate workloads in which multiple parallel client processes keep inserting empty files into several large directories. We found that CephFS only makes progress for the first 5-10 minutes and then stalls -- the clients' creat() calls no longer return.

We were using Ceph 0.72 on Ubuntu 12.10 with kernel 3.6.6. Our setup consisted of 8 OSDs, 3 MDSs, and 1 monitor. All MDSs were active (none standby), and they were all configured to split a directory once its size exceeds 2k entries. We kernel-mounted (not FUSE-mounted) CephFS on all 8 OSD nodes.

To test CephFS, we launched 64 client processes across the 8 OSD nodes (8 processes per node). Each client created 1 directory and then inserted 5k empty files into it, so in total 64 directories and 320k files would be created. CephFS delivered an average throughput of 300~1k ops/s for the first 5 minutes and then stopped making any progress. What might be going wrong?

If each client inserts only 200 files instead of 5k, CephFS can finish the workload at ~1.5k ops/s. With 1k files per client, throughput drops to ~500 ops/s; with 2k files (the split threshold), to ~400 ops/s. Are these numbers reasonable?

-- Qing

_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
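For reference, the directory-splitting setup described above corresponds to an MDS configuration roughly like the fragment below. This is a hedged sketch, not our exact config: the option name is taken from the Ceph MDS balancer settings of that era, and the number of active MDS daemons was raised separately (e.g. with `ceph mds set_max_mds 3`).

```ini
; Hypothetical ceph.conf fragment illustrating the setup described
; (assumption: option names as in the Ceph 0.7x MDS balancer settings)
[mds]
    ; split a directory fragment once it grows past 2000 entries
    mds bal split size = 2000
```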
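The per-client workload can be sketched as follows. This is a minimal Python illustration of what each of the 64 clients does, not mdtest itself; the function and file names are made up for the example.

```python
import os
import time

def create_files(dirpath, n_files):
    """Create one directory, insert n_files empty files into it,
    and return the observed create rate in ops/s.

    Mirrors the mdtest-style workload described above: each client
    makes a single directory and fills it with empty files.
    """
    os.makedirs(dirpath, exist_ok=True)
    start = time.time()
    for i in range(n_files):
        # open with O_CREAT and close immediately, mimicking an
        # empty-file creat() on the mounted CephFS tree
        fd = os.open(os.path.join(dirpath, "file.%d" % i),
                     os.O_CREAT | os.O_WRONLY, 0o644)
        os.close(fd)
    elapsed = time.time() - start
    return n_files / elapsed
```

In the test runs above, 64 such processes would run concurrently (8 per node) with n_files set to 5k, 2k, 1k, or 200.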
