Hi,

From what I have observed, marking the OSD out before crush rm is not any safer.
Normally what I do (when decommissioning an OSD or whole server) is stop the OSD process, then crush rm / osd rm / auth del the OSD shortly afterwards, before the down out interval expires. Since the OSD is already stopped when I crush rm it, there shouldn't be any clients affected by that operation.

The main thing to note is that crush rm of an out or DNE OSD will trigger backfilling, even though intuitively that shouldn't require any data movement. This was confirmed by the developers as a sort of side effect of the current CRUSH implementation.

Cheers, Dan

On Apr 3, 2014 4:00 AM, Chad William Seys <[email protected]> wrote:

Hi All,

Slide 19 of Ceph at CERN presentation http://www.slideshare.net/Inktank_Ceph/scaling-ceph-at-cern says that when removing an OSD from Ceph it is faster to just "ceph osd crush rm <osd id>" rather than marking the osd as "out", waiting for data migration, and then "rm" the OSD.

The reason they give is that "out then rm" leads to two modifications to CRUSH and two data migrations, which takes more time. I have observed this to be true!

However, is it safer to do the "out then rm"? Doesn't just doing an "rm" make replicas unavailable? (BTW, they used replica = 4, so maybe they were less concerned!)

Thanks!
Chad.

_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
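
For reference, the removal sequence described in the reply (stop the OSD, then crush rm / osd rm / auth del) could be sketched as a small dry-run helper. This is only an illustration under stated assumptions: the function name `decommission_cmds` and the example id 12 are hypothetical, the OSD daemon is assumed to have been stopped on its host already (the stop command depends on the init system and is not shown), and the helper only prints the commands so they can be reviewed before anything is run.

```shell
#!/bin/sh
# Dry-run sketch: prints the removal commands for one OSD instead of running them.
# Assumption: the ceph-osd daemon for this id has already been stopped,
# and the down out interval has not yet expired.
decommission_cmds() {
    osd_id="$1"
    # Accept only a non-empty, purely numeric OSD id.
    case "$osd_id" in
        ''|*[!0-9]*)
            echo "usage: decommission_cmds <numeric osd id>" >&2
            return 1
            ;;
    esac
    echo "ceph osd crush rm osd.${osd_id}"   # remove from the CRUSH map (this triggers backfilling)
    echo "ceph osd rm ${osd_id}"             # remove the OSD from the OSD map
    echo "ceph auth del osd.${osd_id}"       # delete the OSD's auth key
}

# Example with a hypothetical OSD id:
decommission_cmds 12
```

Printing rather than executing is a deliberate choice here, since the crush rm step starts data movement as soon as it runs.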
