Hi,

This is indeed the expected behaviour.

The in/out state is used as a second weighting factor in the OSD placement
algorithm, on top of the CRUSH weight. So an OSD with CRUSH weight 1 and
in/out weight 0 is not equivalent to an OSD with CRUSH weight 0: an "out" OSD
still contributes its CRUSH weight to its host bucket, so removing it from the
CRUSH map changes the bucket weights and triggers another round of data
movement.

The correct way to decommission OSDs or hosts is to decrease the CRUSH weight.

Cheers, Dan

On Fri, Dec 3, 2021 at 1:08 PM [email protected] <[email protected]> wrote:
>
> Dear Cephers,
>
> I had to remove a failed OSD server node, and this is what I did:
> 1) First, I marked all OSDs on the server to be removed down and out.
> 2) Second, I let Ceph do the backfilling and rebalancing, and waited for it
>    to complete.
> 3) With full redundancy restored, I deleted the removed OSDs from the
>    cluster, e.g. ceph osd crush remove osd.${OSD_NUM}
> 4) To my surprise, after removing those already-out OSDs from the cluster, I
>    saw a ton of PGs remapped and, once again, BACKFILLING/REBALANCING.
>
> What is the major problem with the above procedure that caused the double
> BACKFILLING/REBALANCING? Could the root cause be the OSDs that were already
> "out" but not yet removed from CRUSH? I previously thought that "out" OSDs
> would not impact CRUSH, but it seems I was wrong.
>
> Any suggestions, comments, or explanations are highly appreciated.
>
> Best regards,
>
> Samuel
>
> [email protected]
> _______________________________________________
> ceph-users mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
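
For reference, the drain-first order recommended in the reply can be sketched roughly as follows. This is only a minimal sketch, assuming the OSD IDs are known and that you wait for backfilling to finish (for example by watching `ceph -s`) between the two phases. The helper names `drain_cmds` and `remove_cmds` are hypothetical and not part of Ceph; they only print commands for review and never contact a cluster. Any further cleanup after the CRUSH removal is left out.

```shell
#!/bin/sh
# Sketch only: print (do not run) the commands for a drain-first removal.
# drain_cmds and remove_cmds are hypothetical helper names, not Ceph tools.

# Phase 1: set the CRUSH weight of each OSD to 0 so data moves off it once.
drain_cmds() {
    for id in "$@"; do
        echo "ceph osd crush reweight osd.${id} 0"
    done
}

# Phase 2 (only after backfilling has completed): mark each OSD out and
# remove it from the CRUSH map. With a CRUSH weight of 0 already in place,
# the removal should not change the bucket weights again.
remove_cmds() {
    for id in "$@"; do
        echo "ceph osd out ${id}"
        echo "ceph osd crush remove osd.${id}"
    done
}
```

Calling `drain_cmds 12 13` prints one `ceph osd crush reweight` line per OSD; once the cluster has settled, `remove_cmds 12 13` prints the follow-up commands. Printing rather than executing is a deliberate choice here, so each step can be checked against the cluster state before it is run by hand.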
