Hi,

We have a Ceph cluster with rack as the failure domain, but the racks are
so imbalanced that we cannot use the full allocated capacity: some OSDs in
the small racks fill up too fast, pushing the cluster into a warning state
and triggering the near_full_ratio.
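
For reference, this is roughly how the imbalance shows up (standard
commands, output omitted):

  # Per-OSD utilization laid out along the CRUSH tree
  ceph osd df tree
  # The configured nearfull/backfillfull/full thresholds
  ceph osd dump | grep ratio
  # Which OSDs are currently flagged near full
  ceph health detail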

We are planning to restructure the entire CRUSH map with rows as the
failure domain instead of racks, so that every row ends up with the same
number of hosts regardless of how many racks each row contains. We use 3x
replication in this cluster.

Current:
Rack 1 has 4 hosts
Rack 2 has 2 hosts
Rack 3 has 3 hosts
Rack 4 has 6 hosts
Rack 5 has 7 hosts
Rack 6 has 2 hosts
Rack 7 has 3 hosts

Future, with each row having 9 hosts:

Row_A = Rack 1 + Rack 2 + Rack 3 = 9 hosts
Row_B = Rack 4 + Rack 7 = 9 hosts
Row_C = Rack 5 + Rack 6 = 9 hosts
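
The commands we have in mind for this look roughly as follows (bucket,
rule, and pool names are illustrative):

  # Create the empty row buckets and attach them under the default root
  ceph osd crush add-bucket row_a row
  ceph osd crush move row_a root=default
  # (same for row_b and row_c)

  # New replicated rule with row as the failure domain
  ceph osd crush rule create-replicated replicated_row default row

  # Point the pools at the new rule
  ceph osd pool set <pool> crush_rule replicated_row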

The question is: how can we do this safely without triggering too much
rebalancing?
I can add the empty rows to the CRUSH map and change the failure domain to
row without any rebalancing, but as soon as I move a rack under a row it
triggers 50-60% data movement, and the cluster even becomes completely
unreachable ("error: connecting to cluster"). How can we avoid that?
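
For reference, the step that sets off the massive movement is the plain
bucket move (rack name illustrative):

  # Re-parenting a rack under a row; this single command is what
  # kicks off the 50-60% rebalance described above
  ceph osd crush move rack1 row=row_a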

Thanks,
Pardhiv Karri
"Rise and Rise again until LAMBS become LIONS"