On Thursday, April 3, 2014, Chad Seys <[email protected]> wrote:
> On Thursday, April 03, 2014 07:57:58 Dan Van Der Ster wrote:
> > Hi,
> > By my observation, I don't think that marking it out before crush rm
> > would be any safer.
> >
> > Normally what I do (when decommissioning an OSD or whole server) is stop
> > the OSD process, then crush rm / osd rm / auth del the OSD shortly
> > afterwards,
>
> Huh! I am using a replication = 2, so I'd be worried about the other drive
> dying before a replication can occur.
>
> For my on the edge cluster, it seems safer to mark the OSD out, then remove
> from CRUSH, then turn off the OSD daemon.
>
> Looks like when an OSD is marked out, reweight is set to 0. Is this the
> same as weight being set to 0? I assume in either case the data is still
> available to be replicated elsewhere.
>
> If one removes an OSD from CRUSH but not turn off the OSD, is the data
> available for replication. (I would guess "no".)

Actually, that's the point -- if you mark an OSD out (which you are
implicitly doing by removing it from the crush map) but it is not down,
the new host OSDs will use it as a data source. Of course, you'll need to
wait to run "OSD rm" until the data movement is done for that. :)
-Greg

> > The main thing to note is that crush rm of an out or DNE OSD will trigger
> > backfilling, even though intuitively that shouldn't require any data
> > movement. This was confirmed by the developers as a sort of side effect
> > of the current CRUSH implementation.
>
> I guess changing the CRUSH does not preserve current data locations (like a
> non-stable sorting algorithm).
>
> Thanks!
> Chad.
> >
> > Cheers, Dan
> >
> > On Apr 3, 2014 4:00 AM, Chad William Seys <[email protected]> wrote:
> > Hi All,
> > Slide 19 of Ceph at CERN presentation
> > http://www.slideshare.net/Inktank_Ceph/scaling-ceph-at-cern
> > says that when removing an OSD from Ceph it is faster to
> > just "ceph osd crush rm <osd id>" rather than marking the
> > osd as "out", waiting for data migration, and then "rm" the
> > OSD.
> > The reason they give is that "out then rm" leads to two modifications
> > to CRUSH and two data migrations, which takes more time.
> > I have observed this to be true!
> >
> > However, is it safer to do the "out then rm"? Doesn't just doing an "rm"
> > make replicas unavailable?
> >
> > (BTW, they used replica = 4, so maybe they were less concerned!)
> >
> > Thanks!
> > Chad.
> > _______________________________________________
> > ceph-users mailing list
> > [email protected]
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

--
Software Engineer #42 @ http://inktank.com | http://ceph.com
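
For reference, the two removal orders discussed in this thread can be written out as a small dry-run sketch. It only builds the list of steps and never calls ceph. The function name removal_plan and the drain_first flag are made up for illustration. The ceph subcommands are the ones named in the thread (crush rm, auth del, osd rm); the order of the last two follows common practice rather than anything stated here, and the "wait" and "stop the daemon" steps are left as comments because they depend on the init system and on how you watch recovery.

```python
from typing import List


def removal_plan(osd_id: int, drain_first: bool = True) -> List[str]:
    """Return the ordered steps for removing one OSD, as plain strings.

    drain_first=True : crush rm while the daemon is still up, so the OSD can
                       act as a data source during the single migration.
    drain_first=False: stop the daemon first, then remove it right away
                       (fewer steps, but one replica is gone during backfill).
    Lines starting with '#' are manual steps, not commands.
    """
    name = f"osd.{osd_id}"
    crush_rm = f"ceph osd crush rm {name}"
    wait = "# wait until 'ceph health' reports HEALTH_OK (data movement done)"
    stop = f"# stop the {name} daemon (init-system specific, not shown)"
    cleanup = [f"ceph auth del {name}", f"ceph osd rm {osd_id}"]

    if drain_first:
        # The OSD stays up and in service as a source until backfill finishes.
        return [crush_rm, wait, stop] + cleanup
    # The daemon goes down first; backfill then runs from the other replicas.
    return [stop, crush_rm] + cleanup + [wait]


if __name__ == "__main__":
    for step in removal_plan(7, drain_first=True):
        print(step)
```

With replication = 2, the drain_first=True order keeps both copies readable while the data moves, which is the case Chad was worried about.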
