Re: [ceph-users] osd crush reweight 0 on "out" OSD causes backfilling?
Step 1.5: wait until the cluster is done backfilling.

CRUSH is an algorithm. Changing the weight of any item in the CRUSH map gives every PG in the entire cluster a chance to move. This is because CRUSH is deterministic, which is also why every client can find any object on any OSD. The amount of data that moves is relative to how much the weights have changed in the CRUSH map, but any PG can be moved regardless of which weights actually changed.

On Tue, Feb 13, 2018, 2:55 PM Christian Sarrasin wrote:
> Thanks! I'm still puzzled as to _what_ data is moving if the OSD was
> previously "out" and didn't host any PG (according to pg dump). The
> host only had one other OSD which was already "out" and had zero weight.
> It looks like Ceph is moving some other data, which wasn't hosted on
> the re-weighted OSD.
>
> Just to reiterate my q: from what I'm reading here, it sounds like the
> best practice to remove an OSD from the cluster is to run:
>
> 1. ceph osd crush reweight osd.X 0
> 2. ceph osd out osd.X
>
> The official doc [1] suggests doing just #2.
>
> [1]
> http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/#removing-osds-manual
>
> On 13/02/18 14:38, David Turner wrote:
> > An out osd still has a crush weight. Removing that osd or weighting it
> > to 0 will change the weight of the host that it's in. That is why data
> > moves again. There is a thread in the ML started by Sage about possible
> > ways to confront the double data shift when drives fail: data moves
> > once when the drive goes out and then again when it is removed from
> > the cluster.
> >
> > If the drive was still readable when it was marked out, the best method
> > is to weight it to 0 while it is still running so it can be used to
> > offload its data. Also in this method, when you remove it from the
> > cluster, there will not be any additional data movement.
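The ordering discussed in this message and the quoted replies (reweight to 0, wait for backfilling, mark out, then remove) can be written out as a small sketch. The function name `removal_plan` and the OSD id 7 are hypothetical. The script only builds and prints command strings and does not talk to a cluster. The `ceph auth del` step is taken from the manual removal procedure in the linked documentation rather than from the thread itself, and the daemon stop command is left as a comment because it depends on the init system.

```python
def removal_plan(osd_id):
    """Return the ordered steps for draining and removing one OSD.

    Lines starting with '#' are manual checks, not commands.
    """
    osd = f"osd.{osd_id}"
    return [
        # Step 1: drop the CRUSH weight so the host weight changes only once.
        f"ceph osd crush reweight {osd} 0",
        # Step 1.5: let the resulting data movement finish.
        "# wait until 'ceph -s' no longer reports backfilling or recovering PGs",
        # Step 2: mark the OSD out.
        f"ceph osd out {osd}",
        "# stop the OSD daemon (the exact command depends on the init system)",
        # Final removal; with the CRUSH weight already at 0 this should not
        # trigger further data movement, according to the thread.
        f"ceph osd crush remove {osd}",
        f"ceph auth del {osd}",
        f"ceph osd rm {osd}",
    ]


for step in removal_plan(7):
    print(step)
```

Keeping the wait as an explicit step makes it harder to mark the OSD out while the first round of backfilling is still running.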
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] osd crush reweight 0 on "out" OSD causes backfilling?
Thanks! I'm still puzzled as to _what_ data is moving if the OSD was previously "out" and didn't host any PG (according to pg dump). The host only had one other OSD which was already "out" and had zero weight. It looks like Ceph is moving some other data, which wasn't hosted on the re-weighted OSD.

Just to reiterate my q: from what I'm reading here, it sounds like the best practice to remove an OSD from the cluster is to run:

1. ceph osd crush reweight osd.X 0
2. ceph osd out osd.X

The official doc [1] suggests doing just #2.

[1] http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/#removing-osds-manual

On 13/02/18 14:38, David Turner wrote:
> An out osd still has a crush weight. Removing that osd or weighting it
> to 0 will change the weight of the host that it's in. That is why data
> moves again. There is a thread in the ML started by Sage about possible
> ways to confront the double data shift when drives fail: data moves
> once when the drive goes out and then again when it is removed from
> the cluster.
>
> If the drive was still readable when it was marked out, the best method
> is to weight it to 0 while it is still running so it can be used to
> offload its data. Also in this method, when you remove it from the
> cluster, there will not be any additional data movement.
Re: [ceph-users] osd crush reweight 0 on "out" OSD causes backfilling?
An out OSD still has a CRUSH weight. Removing that OSD or weighting it to 0 will change the weight of the host that it's in. That is why data moves again. There is a thread on the ML started by Sage about possible ways to confront the double data shift when drives fail: data moves once when the drive goes out and then again when it is removed from the cluster.

If the drive was still readable when it was marked out, the best method is to weight it to 0 while it is still running so it can be used to offload its data. Also in this method, when you remove it from the cluster, there will not be any additional data movement.

On Tue, Feb 13, 2018, 6:55 AM Christian Sarrasin wrote:
> Thanks for your input John! This doesn't really match the doc [1],
> which suggests just taking them out and only using "reweight" in case of
> issues (with small clusters).
>
> Is "reweight" considered a must before removing an OSD?
>
> Cheers
>
> On 13/02/18 12:34, John Petrini wrote:
> > The rule of thumb is to reweight to 0 prior to marking out. This should
> > avoid causing data movement twice as you're experiencing.
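To make the host-weight argument above concrete, here is a toy placement model. It is not Ceph's CRUSH implementation: it assumes a two-level hierarchy (host, then OSD), a straw2-style draw of ln(u)/weight, a simple retry when the chosen OSD is marked out, and one replica per PG. The Firefly cluster in this thread may well use a different bucket algorithm. All names (`draw`, `choose`, `place`, `build`, the host and OSD labels) are hypothetical. The point it illustrates is only that an "out" OSD with a non-zero CRUSH weight still contributes to its host's weight, so zeroing it changes where other PGs land even though the OSD itself holds none.

```python
import hashlib
import math


def draw(pg, item, attempt, weight):
    """Straw2-style draw: ln(u) / weight, with u in (0, 1] taken from a hash."""
    if weight <= 0:
        return -math.inf  # zero-weight items can never win
    digest = hashlib.sha256(f"{pg}:{item}:{attempt}".encode()).digest()
    u = (int.from_bytes(digest[:8], "big") + 1) / 2**64
    return math.log(u) / weight


def choose(pg, weights, attempt):
    """Return the item with the largest draw, or None if every weight is zero."""
    best = max(weights, key=lambda item: draw(pg, item, attempt, weights[item]))
    return best if weights[best] > 0 else None


def place(pg, crush, out):
    """Pick a host by summed CRUSH weight, then an OSD inside it.

    An OSD that is marked out is rejected and the whole descent is retried
    with the next attempt number (a simplification of CRUSH's retry logic).
    """
    host_weights = {host: sum(osds.values()) for host, osds in crush.items()}
    for attempt in range(50):
        host = choose(pg, host_weights, attempt)
        if host is None:
            return None
        osd = choose(pg, crush[host], attempt)
        if osd is not None and osd not in out:
            return osd
    return None


def build(osd5_weight):
    """Toy map: host-c holds two 'out' OSDs, one already at CRUSH weight 0."""
    return {
        "host-a": {"osd.0": 1.0, "osd.1": 1.0},
        "host-b": {"osd.2": 1.0, "osd.3": 1.0},
        "host-c": {"osd.4": 0.0, "osd.5": osd5_weight},
    }


out = {"osd.4", "osd.5"}
pgs = range(256)
before = {pg: place(pg, build(1.0), out) for pg in pgs}  # out, CRUSH weight 1
after = {pg: place(pg, build(0.0), out) for pg in pgs}   # out, CRUSH weight 0

on_osd5 = sum(1 for osd in before.values() if osd == "osd.5")
moved = sum(1 for pg in pgs if before[pg] != after[pg])
print(f"PGs on osd.5 before reweight: {on_osd5}")
print(f"PGs that changed OSD after reweight to 0: {moved} of {len(pgs)}")
```

In this model the PGs that move are the ones whose first attempt used to land on host-c and was then retried elsewhere. Once host-c has weight 0 they are placed on the first attempt instead, which generally yields a different OSD.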
Re: [ceph-users] osd crush reweight 0 on "out" OSD causes backfilling?
Thanks for your input John! This doesn't really match the doc [1], which suggests just taking them out and only using "reweight" in case of issues (with small clusters).

Is "reweight" considered a must before removing an OSD?

Cheers

On 13/02/18 12:34, John Petrini wrote:
> The rule of thumb is to reweight to 0 prior to marking out. This should
> avoid causing data movement twice as you're experiencing.
Re: [ceph-users] osd crush reweight 0 on "out" OSD causes backfilling?
The rule of thumb is to reweight to 0 prior to marking out. This should avoid causing data movement twice as you're experiencing.
[ceph-users] osd crush reweight 0 on "out" OSD causes backfilling?
Hi all,

I'm in the process of decommissioning some OSDs and thought I'd previously migrated all data off them by marking them "out" (which did trigger a fair amount of remapping as expected).

Looking at the pgmap ('ceph pg dump') confirmed that none of the "out" OSDs was hosting any more PGs (columns 'up' and 'acting').

I thought my next prudent step before taking the OSDs down and removing them from the crushmap was to reweight them to 0. To my surprise this caused a flurry of remapping/backfilling.

Is this expected and if so, what am I missing? This is an old Firefly cluster (purpose of taking out the OSDs is to repurpose them into a Luminous cluster we're building...).

Assuming this has no useful function, would I be good to take them out using 'osd crush remove' then 'osd rm' without re-weighting them to 0 first?

Your insight is much appreciated!

Cheers
Christian