On Tuesday, October 21, 2014, Chad Seys <[email protected]> wrote:

> Hi Craig,
>
> > It's part of the way the CRUSH hashing works.  Any change to the CRUSH
> > map causes the computed placement to change slightly.
>
> Dan@cern could not replicate my observations, so I plan to follow his
> procedure (fake create an OSD, wait for rebalance, remove fake OSD) in the
> near future to see if I can replicate his! :)
>
>
> > BTW, it's safer to remove OSDs and hosts by first marking the OSDs UP and
> > OUT (ceph osd out OSDID).  That will trigger the remapping, while keeping
> > the OSDs in the pool so you have all of your replicas.
>
> I am under the impression that the procedure I posted does leave the OSDs
> in the pool while an additional replication takes place: after "ceph osd
> crush remove osd.osdnum" I see that the used % on the removed OSD slowly
> decreases as the relocation of blocks takes place.
>
> If my ceph-fu were strong enough I would try to find some block replicated
> num_replicas+1 times so that my belief would be well-founded. :)
>
> Also, after "ceph osd crush remove osd.osdnum" the OSD still shows up in
> "ceph osd tree", but it is no longer attached to any host.  I think it
> might even still be marked UP and IN, but I cannot confirm.
>
> So I believe so far the approaches are equivalent.
>
> BUT, I think that to keep an OSD out after using "ceph osd out OSDID" one
> needs to turn off "auto in" or something.
>
> I don't want to turn that off b/c in the past I had some slow drives which
> would occasionally be marked "out".  If they stayed "out", that could
> increase load on other drives, making them unresponsive and getting them
> marked "out" as well, leading to a domino effect where too many drives get
> marked "out" and the cluster goes down.
>
> Now I have better hardware, but since the scenario exists, I'd rather avoid
> it! :)


There are separate options for automatically marking new drives in versus
marking in established ones. Should be in the docs! :)
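
For reference, the mon options involved look something like this in
ceph.conf (option names and defaults from memory, so double-check them
against the config reference for your release):

```ini
[mon]
# Applies to brand-new OSDs booting for the first time (default: true)
mon osd auto mark new in = true
# Applies to OSDs that were automatically marked out, e.g. after being
# down too long (default: true)
mon osd auto mark auto out in = true
# Applies to any booting OSD, including ones marked out by hand
# (default: false)
mon osd auto mark in = false
```

With those defaults, an OSD you mark out manually should stay out across
restarts, while flapping drives that the cluster marked out on its own
still come back in.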
-Greg



>
>
> > If you mark the OSDs OUT, wait for the remapping to finish, and remove
> > the OSDs and host from the CRUSH map, there will still be some data
> > migration.
>
> Yep, this is what I see.  But I find it weird.
>
> >
> >
> > Ceph is also really good at handling multiple changes in a row.  For
> > example, I had to reformat all of my OSDs because I chose my mkfs.xfs
> > parameters poorly.  I removed the OSDs, without draining them first,
> > which caused a lot of remapping.  I then quickly formatted the OSDs and
> > put them back in.  The CRUSH map went back to what it started with, and
> > the only remapping required was to re-populate the newly formatted OSDs.
>
> In this case you'd be living with num_replicas-1 for a while.  Sounds
> exciting!  :)
>
> Thanks,
> Chad.
> _______________________________________________
> ceph-users mailing list
> [email protected] <javascript:;>
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
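
For the archives, the drain-first removal sequence being discussed goes
roughly like this (command names as I remember them; substitute your own
OSD id for the hypothetical "12"):

```
# Mark the OSD out but leave it up, so its PGs are remapped while all
# replicas remain available:
ceph osd out 12

# Watch the cluster until backfilling finishes (all PGs active+clean):
ceph -s

# Then actually remove it from the CRUSH map, key store, and cluster:
ceph osd crush remove osd.12
ceph auth del osd.12
ceph osd rm 12
```

This is a sketch of the general procedure, not a substitute for the
add-or-remove-OSDs section of the docs for your release.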


-- 
Software Engineer #42 @ http://inktank.com | http://ceph.com
