On Thursday, April 3, 2014, Chad Seys <[email protected]> wrote:

> On Thursday, April 03, 2014 07:57:58 Dan Van Der Ster wrote:
> > Hi,
> > By my observation, I don't think that marking it out before crush rm would
> > be any safer.
> >
> > Normally what I do (when decommissioning an OSD or whole server) is stop
> > the OSD process, then crush rm / osd rm / auth del the OSD shortly
> > afterwards,
>
> Huh!  I am using replication = 2, so I'd be worried about the other drive
> dying before the data can be re-replicated.
>
> For my on-the-edge cluster, it seems safer to mark the OSD out, then remove
> it from CRUSH, then turn off the OSD daemon.
>
> Looks like when an OSD is marked out, reweight is set to 0.  Is this the
> same as weight being set to 0?  I assume in either case the data is still
> available to be replicated elsewhere.
>
> If one removes an OSD from CRUSH but does not turn off the OSD, is the data
> available for replication?  (I would guess "no".)


Actually, that's the point -- if you mark an OSD out (which you are
implicitly doing by removing it from the CRUSH map) but it is not down, the
OSDs that newly host its data will use it as a data source. Of course, for
that to work you'll need to hold off on "osd rm" until the data movement is
done. :)
-Greg
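
The ordering described above could be scripted roughly as follows. This is
only a sketch under a few assumptions that are not from this thread: that
"ceph health" reporting HEALTH_OK is a good enough sign that data movement
has finished, and that stopping the daemon is delegated to a caller-supplied
function (stop_daemon is a hypothetical hook, since the right command depends
on the init system). The helper names ceph and remove_osd are made up here.

```python
import subprocess
import time
from typing import Callable, List

Runner = Callable[[List[str]], str]


def ceph(args: List[str]) -> str:
    """Run a ceph CLI command and return its stdout (needs a live cluster)."""
    return subprocess.check_output(["ceph"] + args, text=True)


def remove_osd(osd_id: int,
               stop_daemon: Callable[[int], None],
               run: Runner = ceph,
               poll_seconds: float = 30.0) -> None:
    """Remove an OSD from CRUSH first, and delete it only after recovery."""
    name = f"osd.{osd_id}"

    # 1. Drop the OSD from the CRUSH map while its daemon is still up,
    #    so it can keep serving as a source for the new copies.
    run(["osd", "crush", "remove", name])

    # 2. Wait for data movement to finish.
    #    Assumption: HEALTH_OK is an adequate signal on this cluster.
    while not run(["health"]).startswith("HEALTH_OK"):
        time.sleep(poll_seconds)

    # 3. Only now stop the daemon and delete the OSD's key and id.
    stop_daemon(osd_id)
    run(["auth", "del", name])
    run(["osd", "rm", str(osd_id)])
```

Passing a fake run function makes it possible to dry-run the sequence and
check the order of commands before pointing it at a real cluster.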



>
> > The main thing to note is that crush rm of an out or DNE OSD will trigger
> > backfilling, even though intuitively that shouldn't require any data
> > movement. This was confirmed by the developers as a sort of side effect of
> > the current CRUSH implementation.
>
> I guess changing the CRUSH map does not preserve current data locations
> (like a non-stable sorting algorithm).
>
> Thanks!
> Chad.
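
A toy illustration of the side effect Dan mentions. It rests on an assumption
that is my understanding of CRUSH rather than something stated in this
thread: a host's weight is the sum of its OSDs' CRUSH weights, and marking an
OSD out only zeroes a separate reweight override. The host names, OSD names
and weights are made up.

```python
# Toy model only, not Ceph code: CRUSH weights as plain numbers per host.
crush = {
    "host-a": {"osd.0": 1.0, "osd.1": 1.0},
    "host-b": {"osd.2": 1.0, "osd.3": 1.0},
}
# Separate per-OSD override; "out" is modelled as setting it to 0.
reweight = {osd: 1.0 for osds in crush.values() for osd in osds}


def host_share(crush_map, host):
    """Fraction of the total CRUSH weight that sits under one host."""
    total = sum(sum(osds.values()) for osds in crush_map.values())
    return sum(crush_map[host].values()) / total


def crush_rm(crush_map, osd):
    """Return a copy of the map with one OSD removed from its host."""
    return {host: {o: w for o, w in osds.items() if o != osd}
            for host, osds in crush_map.items()}


before = host_share(crush, "host-a")        # 2 of 4 weight units
reweight["osd.1"] = 0.0                     # mark osd.1 out
after_out = host_share(crush, "host-a")     # CRUSH weights untouched
after_rm = host_share(crush_rm(crush, "osd.1"), "host-a")  # 1 of 3 units
print(before, after_out, after_rm)
```

In this model the out step does not change how much data CRUSH aims at
host-a, while the later crush rm does, which would account for a second
round of backfilling.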
>
>
>
> >
> > Cheers, Dan
> >
> > On Apr 3, 2014 4:00 AM, Chad William Seys <[email protected]> wrote:
> > Hi All,
> >   Slide 19 of Ceph at CERN presentation
> > http://www.slideshare.net/Inktank_Ceph/scaling-ceph-at-cern
> > says that when removing an OSD from Ceph it is faster to
> > just "ceph osd crush rm <osd id>" rather than marking the
> > osd as "out", waiting for data migration, and then "rm" the
> > OSD.
> >   The reason they give is that "out then rm" leads to two modifications
> > to CRUSH and two data migrations, which takes more time.
> >   I have observed this to be true!
> >
> >   However, is it safer to do the "out then rm"?  Doesn't just doing an "rm"
> > make replicas unavailable?
> >
> > (BTW, they used replica = 4, so maybe they were less concerned!)
> >
> > Thanks!
> > Chad.
> > _______________________________________________
> > ceph-users mailing list
> > [email protected]
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>


-- 
Software Engineer #42 @ http://inktank.com | http://ceph.com
