Hi,
In my experience, marking the OSD out before the crush rm is not any safer.

What I normally do (when decommissioning an OSD or a whole server) is stop the 
OSD process, then crush rm / osd rm / auth del the OSD shortly afterwards, 
before the down out interval (mon osd down out interval) expires. Since the OSD 
is already stopped when I crush rm it, no clients should be affected by that 
operation.
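
For illustration, that sequence could be sketched roughly as below. This is a 
dry run that only prints the commands rather than executing them. The function 
name osd_removal_plan and the OSD id 12 are made up for the example, and the 
systemctl line assumes a systemd host; substitute whatever stops the OSD 
daemon on your init system.

```shell
#!/bin/sh
# Dry-run sketch: print the removal steps for one OSD, in order.
# Nothing here touches the cluster; it only echoes the commands.
osd_removal_plan() {
    id="$1"   # numeric OSD id, e.g. 12 for osd.12 (hypothetical example)

    # 1. Stop the OSD daemon first, so no client I/O is served by it.
    #    Assumption: systemd unit naming; older hosts use a different command.
    echo "systemctl stop ceph-osd@${id}"

    # 2. Remove it from the CRUSH map (this is what triggers data movement).
    echo "ceph osd crush rm osd.${id}"

    # 3. Remove the OSD from the cluster map.
    echo "ceph osd rm ${id}"

    # 4. Delete its authentication key.
    echo "ceph auth del osd.${id}"
}

osd_removal_plan 12
```

Review the printed plan and run the steps by hand, keeping an eye on cluster 
health between steps.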

The main thing to note is that crush rm of an OSD that is already out or DNE 
(does not exist) will still trigger backfilling, even though intuitively that 
shouldn't require any data movement. The developers confirmed this as a side 
effect of the current CRUSH implementation.

Cheers, Dan

On Apr 3, 2014 4:00 AM, Chad William Seys <[email protected]> wrote:
Hi All,
  Slide 19 of the Ceph at CERN presentation
http://www.slideshare.net/Inktank_Ceph/scaling-ceph-at-cern
says that when removing an OSD from Ceph it is faster to
just "ceph osd crush rm <osd id>" rather than marking the
osd as "out", waiting for data migration, and then "rm" the
OSD.
  The reason they give is that "out then rm" leads to two modifications
to CRUSH and two data migrations, which takes more time.
  I have observed this to be true!

  However, is it safer to do the "out then rm"?  Doesn't just doing an "rm"
make replicas unavailable?

(BTW, they used replica = 4, so maybe they were less concerned!)

Thanks!
Chad.
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com