Re: [ceph-users] osd crush reweight 0 on "out" OSD causes backfilling?

2018-02-13 Thread David Turner
Step 1.5: wait until the cluster is done backfilling. CRUSH is an algorithm. Changing the weight of any item in the CRUSH map gives every PG in the entire cluster a chance to move. This is because CRUSH is deterministic, and it is why every client can find any object on any OSD. The amount of data that moves is…
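
(For reference, one way to confirm backfill has finished before moving on; a sketch, and exact output varies by release:)

    # overall cluster state; wait until no PGs are backfilling/remapped
    ceph -s
    # summary line should report all PGs as active+clean
    ceph pg stat
    # list the PGs still involved in backfill, if any
    ceph health detail | grep -i backfill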

Re: [ceph-users] osd crush reweight 0 on "out" OSD causes backfilling?

2018-02-13 Thread Christian Sarrasin
Thanks! I'm still puzzled as to _what_ data is moving if the OSD was previously "out" and didn't host any PG (according to pg dump). The host only had one other OSD which was already "out" and had zero weight. It looks like Ceph is moving some other data, which wasn't hosted on the re-weighted…
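
(One way to see exactly which PGs move; a sketch, where osd.4 is a placeholder id:)

    # snapshot the PG mappings, change the weight, snapshot again
    ceph pg dump pgs_brief > pgs.before
    ceph osd crush reweight osd.4 0
    ceph pg dump pgs_brief > pgs.after
    # PGs whose up/acting sets differ are the ones being remapped
    diff pgs.before pgs.after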

Re: [ceph-users] osd crush reweight 0 on "out" OSD causes backfilling?

2018-02-13 Thread David Turner
An "out" OSD still has a CRUSH weight. Removing that OSD, or weighting it to 0, will change the weight of the host that it's in. That is why data moves again. There is a thread in the ML started by Sage about possible ways to confront the double data shift when drives fail: data moving once when it goes…
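
(The distinction is visible in "ceph osd tree": marking an OSD "out" zeroes the REWEIGHT column, but the WEIGHT column still feeds the host bucket. A sketch; the ids, names and weights below are made up:)

    $ ceph osd tree
    ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF
    -2       3.63998     host node1
     3   hdd 1.81999         osd.3      up  1.00000 1.00000
     4   hdd 1.81999         osd.4      up        0 1.00000
    # osd.4 is out (REWEIGHT 0) yet still adds 1.81999 to node1's
    # bucket weight; zeroing its CRUSH weight shrinks the host and
    # lets CRUSH reshuffle data across the whole tree:
    $ ceph osd crush reweight osd.4 0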

Re: [ceph-users] osd crush reweight 0 on "out" OSD causes backfilling?

2018-02-13 Thread Christian Sarrasin
Thanks for your input John! This doesn't really match the doc [1], which suggests just taking them out and only using "reweight" in case of issues (with small clusters). Is "reweight" considered a must before removing an OSD? Cheers On 13/02/18 12:34, John Petrini wrote: > The rule of thumb…
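
(For comparison, the documented removal sequence; a sketch with a placeholder id, assuming a systemd-based install:)

    # documented procedure: mark out, let data migrate, then remove
    ceph osd out 4
    # wait for backfill to finish (ceph -s), then on the OSD's host:
    systemctl stop ceph-osd@4
    # purge it from the CRUSH map, the auth database and the OSD map
    ceph osd crush remove osd.4
    ceph auth del osd.4
    ceph osd rm 4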

Re: [ceph-users] osd crush reweight 0 on "out" OSD causes backfilling?

2018-02-13 Thread John Petrini
The rule of thumb is to reweight to 0 prior to marking out. This should avoid causing data movement twice as you're experiencing.
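
(In command form, that rule of thumb looks roughly like this; osd.4 is a placeholder, and the crush reweight triggers the only rebalance:)

    # drain the OSD first; this is the single data movement
    ceph osd crush reweight osd.4 0
    # once ceph -s shows all PGs active+clean, marking it out and
    # removing it moves nothing further: its weight is already 0
    ceph osd out 4
    ceph osd crush remove osd.4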

[ceph-users] osd crush reweight 0 on "out" OSD causes backfilling?

2018-02-13 Thread Christian Sarrasin
Hi all, I'm in the process of decommissioning some OSDs and thought I'd previously migrated all data off them by marking them "out" (which did trigger a fair amount of remapping as expected). Looking at the pgmap ('ceph pg dump') confirmed that none of the "out" OSDs was hosting any more PGs…
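
(Two ways to make that same check; a sketch, with a placeholder id:)

    # PGs that currently map to a given OSD; empty output means none
    ceph pg ls-by-osd 4
    # per-OSD utilisation, including a PGS count column
    ceph osd df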