Setting an osd's weight to 0.0 in the crush map tells all PGs to move off
that osd. It's effectively the same as removing the osd from the cluster,
except that the osd can help move the data it still holds, so you avoid
degraded PGs and objects while you do it. The limit to weighting osds to
0.0 is how full your cluster and the remaining osds will be once the 0.0
osds are empty.
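
For example (the osd ids here are just placeholders, and note this is the
crush weight, not the "ceph osd reweight" override):

    ceph osd crush reweight osd.12 0.0
    ceph osd crush reweight osd.13 0.0

The PGs on those osds then show up as remapped and backfill elsewhere while
the osds stay up and in, rather than going degraded.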

On Tue, Jun 20, 2017, 10:29 AM Peter Maloney <
peter.malo...@brockmann-consult.de> wrote:

> these settings can be set on a specific OSD:
>
> osd recovery max active = 1
> osd max backfills = 1
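
To apply those to just one running osd at runtime, injectargs should work
(the osd id below is only an example):

    ceph tell osd.7 injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'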
>
>
> I don't know if it will behave as you expect if you set 0... (I tested
> setting 0 and it didn't complain, but does 0 actually mean 0, or
> unlimited, or an error?)
>
> Maybe you could parse the ceph pg dump, look at the pgs that list your
> special osds, then set the config to 1 on every osd those pgs list (not
> just the special ones) and to 0 on the rest. But this will not prioritize
> specific pgs... or even specific osds, and maybe it'll end up being all
> osds.
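
Something along these lines is how I'd pull that out of the pg dump
(untested sketch; the json layout and field names can vary a bit between
releases, and the osd ids are just placeholders):

    #!/usr/bin/env python
    # Rough sketch: find pgs whose up or acting set touches the "special"
    # osds, and collect every osd those pgs involve (the ones to leave at 1).
    import json
    import subprocess

    SPECIAL_OSDS = {3, 7}  # placeholder ids: the osds you want to empty

    raw = subprocess.check_output(['ceph', 'pg', 'dump', '--format', 'json'])
    dump = json.loads(raw.decode('utf-8'))
    # some releases nest the stats under "pg_map"
    pg_stats = dump.get('pg_map', dump).get('pg_stats', [])

    involved = set()
    for pg in pg_stats:
        osds = set(pg['up']) | set(pg['acting'])
        if osds & SPECIAL_OSDS:
            print(pg['pgid'], 'up', pg['up'], 'acting', pg['acting'])
            involved |= osds

    print('osds to keep at 1:', sorted(involved))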
>
> To refine the criteria further, you could select only the pgs where the
> direction of movement is what you want... e.g. if up (where CRUSH wants
> the data once recovery is done) says [1,2,3] and acting (where it is now,
> even for partial pgs I think) says [1,2,7], and you want to empty 7, then
> you have to set the numbers non-zero for osd 3 and 7, but maybe not for 1
> or 2 (although those could still be read from as part of recovery).
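
For that per-pg check, the symmetric difference of up and acting is
basically it (again untested; the ids just match the example above):

    def movement_osds(up, acting):
        # osds that the backfill actually has to touch: new target(s) that
        # appear only in "up", plus old holder(s) that appear only in
        # "acting" and will be emptied
        return (set(up) - set(acting)) | (set(acting) - set(up))

    print(movement_osds([1, 2, 3], [1, 2, 7]))  # -> {3, 7}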
>
> I'm sure it's doomed to fail, but you can try it out on a test cluster.
>
> My guess is that it will either not treat 0 the way you expect, or that
> only a small fraction of your osds will be ones you can set to 0.
>
>
>
> On 06/20/17 14:44, Richard Hesketh wrote:
>
> Is there a way, either by individual PG or by OSD, that I can prioritise
> backfill/recovery on a set of PGs which are currently particularly
> important to me?
>
> For context, I am replacing disks in a 5-node Jewel cluster on a
> node-by-node basis - mark out the OSDs on a node, wait for them to clear,
> replace the disks, bring the new OSDs up and in, mark out the OSDs on the
> next node, and so on. I've done my first node, but the significant CRUSH
> map changes mean most of my data is moving. Right now I only care about
> the PGs on the next set of OSDs to be replaced - I don't care about the
> other remapped PGs settling, because they're only going to move around
> again after I do the next set of disks. I do want the PGs on the OSDs I am
> about to replace to backfill, because I don't want to compromise data
> integrity by downing those OSDs while they still host active PGs. If I
> could specifically prioritise the backfill on those PGs/OSDs, I could get
> on with replacing disks without worrying about causing degraded PGs.
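
In command terms that per-node cycle is roughly the following (12 standing
in for each osd id; the recreate step depends on how the osds were
deployed):

    ceph osd out 12        # mark out, then wait for backfill to finish
    # ...swap the disk and recreate the osd...
    ceph osd in 12         # once the replacement osd exists again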
>
> I'm in a situation right now where there are merely a couple of dozen PGs
> on the disks I want to replace, all of which are remapped and waiting to
> backfill - but there are 2200 other PGs also waiting to backfill because
> they've moved around too, and it's extremely frustrating to be sat waiting
> to see when the ones I care about will finally be handled so I can get on
> with replacing those disks.
>
> Rich
>
> --
>
> --------------------------------------------
> Peter Maloney
> Brockmann Consult
> Max-Planck-Str. 2
> 21502 Geesthacht
> Germany
> Tel: +49 4152 889 300
> Fax: +49 4152 889 333
> E-mail: peter.malo...@brockmann-consult.de
> Internet: http://www.brockmann-consult.de
> --------------------------------------------
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
