Re: [ceph-users] osd crush reweight 0 on "out" OSD causes backfilling?

2018-02-13 Thread David Turner
Step 1.5 wait until the cluster is done backfilling.

CRUSH is an algorithm. Changing the weight of any item in the CRUSH map gives
every PG in the entire cluster a chance to move. This is because CRUSH is
deterministic, which is also why every client can locate any object on any
OSD. The amount of data that moves is roughly proportional to how much the
weights have changed in the CRUSH map, but any PG can be remapped regardless
of which weights actually changed.
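
For concreteness, a rough sketch of the whole sequence (osd.X is a
placeholder and the "ceph health" grep is only a crude check for
backfill/recovery activity, so adjust it for your release):

ceph osd crush reweight osd.X 0                      # step 1: drain the OSD via CRUSH
while ceph health | grep -qE 'backfill|recover'; do  # step 1.5: wait until backfilling finishes
    sleep 60
done
ceph osd out osd.X                                   # step 2: mark it out (no further movement expected)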

On Tue, Feb 13, 2018, 2:55 PM Christian Sarrasin wrote:

> Thanks!  I'm still puzzled as to _what_ data is moving if the OSD was
> previously "out" and didn't host any PG (according to pg dump).  The
> host only had one other OSD which was already "out" and had zero weight.
>  It looks like Ceph is moving some other data, which wasn't hosted on
> the re-weighted OSD.
>
> Just to reiterate my q: from what I'm reading here, it sounds like the
> best practice to remove an OSD from the cluster is to run:
>
> 1. ceph osd crush reweight osd.X 0
> 2. ceph osd out osd.X
> 
>
> The official doc [1] suggests doing just #2.
>
> [1]
>
> http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/#removing-osds-manual
>
> On 13/02/18 14:38, David Turner wrote:
> > An out osd still has a crush weight. Removing that osd or weighting it
> > to 0 will change the weight of the host that it's in. That is why data
> > moves again. There is a thread in the ML started by Sage about possible
> > ways to confront the double data shift when drives fail: data moves once
> > when it goes out and then again when it is removed from the cluster.
> >
> > If the drive was still readable when it was marked out, the best method
> > is to weight it to 0 while it is still running so it can be used to
> > offload its data. Also in this method, when you remove it from the
> > cluster, there will not be any additional data movement.


Re: [ceph-users] osd crush reweight 0 on "out" OSD causes backfilling?

2018-02-13 Thread Christian Sarrasin
Thanks!  I'm still puzzled as to _what_ data is moving if the OSD was
previously "out" and didn't host any PG (according to pg dump).  The
host only had one other OSD which was already "out" and had zero weight.
 It looks like Ceph is moving some other data, which wasn't hosted on
the re-weighted OSD.

Just to reiterate my q: from what I'm reading here, it sounds like the
best practice to remove an OSD from the cluster is to run:

1. ceph osd crush reweight osd.X 0
2. ceph osd out osd.X


The official doc [1] suggests doing just #2.

[1]
http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/#removing-osds-manual

On 13/02/18 14:38, David Turner wrote:
> An out osd still has a crush weight. Removing that osd or weighting it
> to 0 will change the weight of the host that it's in. That is why data
> moves again. There is a thread in the ML started by Sage about possible
> ways to confront the double data shift when drives fail: data moves once
> when it goes out and then again when it is removed from the cluster.
> 
> If the drive was still readable when it was marked out, the best method
> is to weight it to 0 while it is still running so it can be used to
> offload its data. Also in this method, when you remove it from the
> cluster, there will not be any additional data movement.


Re: [ceph-users] osd crush reweight 0 on "out" OSD causes backfilling?

2018-02-13 Thread David Turner
An out osd still has a crush weight. Removing that osd or weighting it to 0
will change the weight of the host that it's in. That is why data moves
again. There is a thread in the ML started by Sage about possible ways to
confront the double data shift when drives fail: data moves once when it
goes out and then again when it is removed from the cluster.
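
To see it in the CRUSH map itself (the host and OSD names below are
hypothetical), compare the host bucket's weight before and after zeroing
the drained OSD:

ceph osd tree                      # a host bucket's weight is the sum of its OSDs' CRUSH weights;
                                   # an "out" OSD still contributes its full CRUSH weight
ceph osd crush reweight osd.7 0    # the host shrinks by osd.7's former weight, so CRUSH
                                   # recomputes placements and PGs start backfilling
ceph osd tree                      # confirm the host's weight dropped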

If the drive was still readable when it was marked out, the best method is
to weight it to 0 while it is still running so it can be used to offload
its data. Also in this method, when you remove it from the cluster, there
will not be any additional data movement.

On Tue, Feb 13, 2018, 6:55 AM Christian Sarrasin wrote:

> Thanks for your input John!  This doesn't really match the doc [1],
> which suggests just taking them out and only using "reweight" in case of
> issues (with small clusters).
>
> Is "reweight" considered a must before removing and OSD?
>
> Cheers
>
> On 13/02/18 12:34, John Petrini wrote:
> > The rule of thumb is to reweight to 0 prior to marking out. This should
> > avoid causing data movement twice, as you're experiencing.


Re: [ceph-users] osd crush reweight 0 on "out" OSD causes backfilling?

2018-02-13 Thread Christian Sarrasin
Thanks for your input John!  This doesn't really match the doc [1],
which suggests just taking them out and only using "reweight" in case of
issues (with small clusters).

Is "reweight" considered a must before removing and OSD?

Cheers

On 13/02/18 12:34, John Petrini wrote:
> The rule of thumb is to reweight to 0 prior to marking out. This should
> avoid causing data movement twice, as you're experiencing.


Re: [ceph-users] osd crush reweight 0 on "out" OSD causes backfilling?

2018-02-13 Thread John Petrini
The rule of thumb is to reweight to 0 prior to marking out. This should
avoid causing data movement twice, as you're experiencing.


[ceph-users] osd crush reweight 0 on "out" OSD causes backfilling?

2018-02-13 Thread Christian Sarrasin
Hi all,

I'm in the process of decommissioning some OSDs and thought I'd
previously migrated all data off them by marking them "out" (which did
trigger a fair amount of remapping as expected).

Looking at the pgmap ('ceph pg dump') confirmed that none of the "out"
OSDs was hosting any more PGs (columns 'up' and 'acting').
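
(Roughly, the check was along these lines; osd id 7 is just an example and
the grep over the full dump is crude. Newer releases also have
"ceph pg ls-by-osd" for this.)

ceph pg dump 2>/dev/null | grep -E '\[([0-9]+,)*7(,[0-9]+)*\]' \
    || echo "no PG lists osd.7 in its up or acting set"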

I thought my next prudent step before taking the OSDs down and removing
them from the crushmap was to reweight them to 0.  To my surprise this
caused a flurry of remapping/backfilling.

Is this expected and, if so, what am I missing?  This is an old Firefly
cluster (the purpose of taking the OSDs out is to repurpose them into a
Luminous cluster we're building...).

Assuming this has no useful function, would it be safe to take them out
using 'osd crush remove' and then 'osd rm', without re-weighting them to 0 first?
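
In other words, something along these lines (osd.X is a placeholder):

ceph osd crush remove osd.X    # drop it from the CRUSH map (this is what changes the host weight)
ceph auth del osd.X            # remove its cephx key
ceph osd rm osd.X              # remove it from the OSD map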

Your insight is much appreciated!

Cheers
Christian
