On Thu, 5 Dec 2019 at 00:28, Milan Kupcevic <
milan_kupce...@harvard.edu> wrote:

>
>
> There is plenty of space to take more than a few failed nodes. But the
> question was about what is going on inside a node with a few failed
> drives. Current Ceph behavior keeps increasing the number of placement
> groups on the surviving drives inside the same node. It does not spread
> them across the cluster. So, let's get back to the original question:
> shall host weight auto-reduce on HDD failure, or not?
>

If the OSDs are still in the crush map with non-zero crush weights, they
still add "value" to the host, and hence the host gets as many PGs as the
sum of those crush weights (i.e., sizes) says it can bear.
If some of the OSDs have a zero OSD-reweight value, they will not take
part of the burden, but rather let the "surviving" OSDs on the host take
more load, until the cluster decides the broken OSDs are down and out. At
that point the cluster rebalances according to the general algorithm,
which should(*) even things out, letting the OSD hosts with fewer OSDs
get fewer PGs and hence less data.

*) There are reports of Nautilus (only, as far as I remember) having weird
placement ideas that tend to fill up OSDs that already hold a lot of data,
leaving it to the ceph admin to force reweight values down in order not to
go over 85%, at which point some rebalancing ops will stop.
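For what it's worth, here is a toy sketch of the arithmetic described above.
This is not Ceph's actual CRUSH/straw2 code; the host names, OSD sizes and PG
count are made up. It only illustrates the point: as long as a failed OSD keeps
its crush weight, the host's share of PGs stays the same and the surviving OSDs
on that host absorb it; once the failed OSDs are out, the host's share drops.

    # Toy model, not Ceph code: PGs split across hosts in proportion to the
    # sum of their OSDs' crush weights; within a host, OSDs with reweight 0
    # take nothing and the remaining OSDs absorb the host's full share.

    def host_share(hosts, total_pgs):
        # Host share is proportional to the sum of its OSDs' crush weights.
        total_weight = sum(sum(o["crush"] for o in osds) for osds in hosts.values())
        return {h: total_pgs * sum(o["crush"] for o in osds) / total_weight
                for h, osds in hosts.items()}

    def osd_share(osds, host_pgs):
        # Within a host, reweight-0 OSDs get nothing; the rest split the load.
        live = sum(o["crush"] * o["reweight"] for o in osds)
        return [host_pgs * o["crush"] * o["reweight"] / live for o in osds]

    # Three hosts with four 4 TiB OSDs each; on host-a two OSDs have failed
    # (OSD-reweight 0) but still sit in the crush map with full crush weight.
    hosts = {
        "host-a": [{"crush": 4.0, "reweight": 0.0}, {"crush": 4.0, "reweight": 0.0},
                   {"crush": 4.0, "reweight": 1.0}, {"crush": 4.0, "reweight": 1.0}],
        "host-b": [{"crush": 4.0, "reweight": 1.0} for _ in range(4)],
        "host-c": [{"crush": 4.0, "reweight": 1.0} for _ in range(4)],
    }

    shares = host_share(hosts, total_pgs=1024)
    print(shares)                                        # host-a still gets a full third
    print(osd_share(hosts["host-a"], shares["host-a"]))  # carried by only two OSDs

    # Once the failed OSDs are out (no longer counted for placement), host-a's
    # share drops and the other hosts pick up the difference.
    hosts["host-a"][0]["crush"] = hosts["host-a"][1]["crush"] = 0.0
    print(host_share(hosts, total_pgs=1024))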


-- 
May the most significant bit of your life be positive.