I have a Ceph Luminous cluster running on CentOS 7 nodes.
The cluster has 50 OSDs, all of the same size and all with the same
weight.

Since I noticed quite uneven usage of the OSDs (some at 30 %, some at
70 %), I tried to activate the balancer.

But the balancer doesn't start, I guess because of this problem:

[root@ceph-mon-01 ~]# ceph osd crush weight-set create-compat
Error EPERM: crush map contains one or more bucket(s) that are not straw2


So I issued the command to convert the buckets from straw to straw2 (all
the clients are running Luminous):


[root@ceph-mon-01 ~]# ceph osd crush set-all-straw-buckets-to-straw2
Error EINVAL: new crush map requires client version hammer but
require_min_compat_client is firefly
[root@ceph-mon-01 ~]# ceph osd set-require-min-compat-client jewel
set require_min_compat_client to jewel
[root@ceph-mon-01 ~]# ceph osd crush set-all-straw-buckets-to-straw2
[root@ceph-mon-01 ~]#
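
If it helps, I can also post the output of "ceph features", which should
show the release reported by the connected clients and daemons:

ceph features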


After issuing the command, the cluster went into HEALTH_WARN because
~12 % of the objects were misplaced.

Is this normal?
I read somewhere that the migration from straw to straw2 should trigger
data movement only if the OSDs have different sizes, which is not my case.


The cluster is still recovering, but what worries me is that data seem
to be moving onto the most used OSDs, and the MAX_AVAIL value is
decreasing quite quickly.
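
For reference, the MAX_AVAIL figure is the one reported by "ceph df", and
the per-OSD usage I mentioned comes from "ceph osd df"; I can post the
full output of both if useful:

ceph df
ceph osd df tree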

I hope the recovery can finish without causing problems; then I will
immediately activate the balancer.
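
What I plan to run, roughly, is (crush-compat mode, which as far as I
understand is what the compat weight-set above is for; please tell me if
I am missing a step):

ceph balancer mode crush-compat
ceph balancer on
ceph balancer status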

But if some OSDs are getting too full, is it safe to decrease their
weights while the cluster is still recovering?
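
What I had in mind is a temporary override on the too-full OSDs, something
like:

ceph osd reweight <osd-id> 0.90

rather than changing the CRUSH weight with "ceph osd crush reweight", but
I am not sure which of the two is less disruptive while the backfill is
in progress.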

Thanks a lot for your help.
Of course I can provide more info if needed.


Cheers, Massimo