Hi,

We have a Ceph cluster with rack as the failure domain, but the racks are so imbalanced that we cannot use all of the storage allocated: some OSDs in the small racks fill up too fast, putting the cluster into a warning state and triggering near_full_ratio.
We are planning to restructure the entire CRUSH map with rows as the failure domain instead of racks, so that each row has the same number of hosts regardless of how many racks it contains. We are using 3x replication in our cluster.

Current:
  Rack 1: 4 hosts
  Rack 2: 2 hosts
  Rack 3: 3 hosts
  Rack 4: 6 hosts
  Rack 5: 7 hosts
  Rack 6: 2 hosts
  Rack 7: 3 hosts

Future (9 hosts per row):
  Row_A = Rack 1 + Rack 2 + Rack 3 = 9 hosts
  Row_B = Rack 4 + Rack 7 = 9 hosts
  Row_C = Rack 5 + Rack 6 = 9 hosts

The question is: how can we do this safely without triggering too much rebalancing? I can add empty rows to the CRUSH map and change the failure domain to row without any rebalancing, but as soon as I move a rack under a row it triggers 50-60% data movement, and the cluster even becomes unreachable (error: connecting to cluster). How can we avoid this?

Thanks,
*Pardhiv Karri*
"Rise and Rise again until LAMBS become LIONS"
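P.S. For reference, this is the sequence I am attempting, sketched with placeholder bucket and pool names (row_a, rack1, <pool>) and with the cluster flags set so nothing starts moving mid-edit; adjust to your own map before trying it anywhere:

```shell
# Pause data movement while the topology is being edited.
ceph osd set norebalance
ceph osd set nobackfill
ceph osd set norecover

# Create the empty row buckets under the default root.
ceph osd crush add-bucket row_a row
ceph osd crush move row_a root=default
ceph osd crush add-bucket row_b row
ceph osd crush move row_b root=default
ceph osd crush add-bucket row_c row
ceph osd crush move row_c root=default

# Move the racks under their rows -- this is the step that
# remaps 50-60% of the PGs once the flags come off.
ceph osd crush move rack1 row=row_a
ceph osd crush move rack2 row=row_a
ceph osd crush move rack3 row=row_a
ceph osd crush move rack4 row=row_b
ceph osd crush move rack7 row=row_b
ceph osd crush move rack5 row=row_c
ceph osd crush move rack6 row=row_c

# New replicated rule with row as the failure domain,
# then point each pool at it.
ceph osd crush rule create-replicated rep_row default row
ceph osd pool set <pool> crush_rule rep_row

# Let backfill proceed (throttled via osd_max_backfills if needed).
ceph osd unset norecover
ceph osd unset nobackfill
ceph osd unset norebalance
```

The resulting map can also be previewed offline before touching the live cluster: export it with `ceph osd getcrushmap`, decompile/edit/recompile it with `crushtool -d` / `crushtool -c`, and compare mappings with `crushtool --test -i <map> --num-rep 3 --show-mappings`.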
_______________________________________________ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com