Re: [ceph-users] stuck with active+undersized+degraded on Jewel after cluster maintenance

2018-08-03 Thread Pawel S
On Fri, Aug 3, 2018 at 2:07 PM Paweł Sadowski wrote:

> On 08/03/2018 01:45 PM, Pawel S wrote:
> > hello!
> >
> > We did maintenance work (cluster shrinking) on one cluster (Jewel),
> > and after shutting one of the OSDs down we found that recovery of a
> > PG could not start because it was stuck "querying" one of the peers.
> > We restarted this OSD and tried marking it out and in again; nothing
> > helped. Finally we moved the data off it (the PG was still on it) and
> > removed the OSD from the CRUSH map and the whole cluster. But recovery
> > still won't start on any other OSD to recreate this copy. We still
> > have two valid, active copies, but we would like the PG to be clean.
> > How can we push recovery to place the third copy somewhere?
> > Replication size is 3 across hosts and there are plenty of them.
> >
> > Status now:
> >    health HEALTH_WARN
> >             1 pgs degraded
> >             1 pgs stuck degraded
> >             1 pgs stuck unclean
> >             1 pgs stuck undersized
> >             1 pgs undersized
> >             recovery 268/19265130 objects degraded (0.001%)
> >
> > Link to PG query details, health status and version commit here:
> > https://gist.github.com/pejotes/aea71ecd2718dbb3ceab0e648924d06b
> Can you add 'ceph osd tree', 'ceph osd crush show-tunables' and 'ceph
> osd crush rule dump' output? It looks like CRUSH is not able to find a
> place for the 3rd copy due to a big difference in rack/host weights,
> depending on your crush rules.
>
>
Yes, you were right :-)

I quickly went through the algorithm and found that it simply doesn't get
enough tries to handle this weight difference (I had 54, 115 and 145) in
my failure domains. As a workaround, increasing "choose_total_tries" to
100 did the trick. The rules were set to choose across datacenter buckets
built from racks and hosts. The next step will be to balance the weights
of the datacenter buckets to equalize them a bit; a couple of OSDs can be
removed. :-)
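
For the archives, the usual way to change that tunable is the
getcrushmap/crushtool round trip, roughly like this (file names and the
rule id are just examples, not the exact commands we ran):

  ceph osd getcrushmap -o crushmap.bin
  crushtool -d crushmap.bin -o crushmap.txt
  # edit crushmap.txt and set: tunable choose_total_tries 100
  crushtool -c crushmap.txt -o crushmap.new
  # optional sanity check before injecting the new map:
  crushtool -i crushmap.new --test --rule 0 --num-rep 3 --show-bad-mappings
  ceph osd setcrushmap -i crushmap.new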
Thank you Pawel!

best regards!
Pawel


Re: [ceph-users] stuck with active+undersized+degraded on Jewel after cluster maintenance

2018-08-03 Thread Paweł Sadowski
On 08/03/2018 01:45 PM, Pawel S wrote:
> hello!
>
> We did maintenance work (cluster shrinking) on one cluster (Jewel),
> and after shutting one of the OSDs down we found that recovery of a
> PG could not start because it was stuck "querying" one of the peers.
> We restarted this OSD and tried marking it out and in again; nothing
> helped. Finally we moved the data off it (the PG was still on it) and
> removed the OSD from the CRUSH map and the whole cluster. But recovery
> still won't start on any other OSD to recreate this copy. We still
> have two valid, active copies, but we would like the PG to be clean.
> How can we push recovery to place the third copy somewhere?
> Replication size is 3 across hosts and there are plenty of them.
>
> Status now: 
>    health HEALTH_WARN
>             1 pgs degraded
>             1 pgs stuck degraded
>             1 pgs stuck unclean
>             1 pgs stuck undersized
>             1 pgs undersized
>             recovery 268/19265130 objects degraded (0.001%)
>
> Link to PG query details, health status and version commit here:
> https://gist.github.com/pejotes/aea71ecd2718dbb3ceab0e648924d06b
Can you add 'ceph osd tree', 'ceph osd crush show-tunables' and 'ceph
osd crush rule dump' output? It looks like CRUSH is not able to find a
place for the 3rd copy due to a big difference in rack/host weights,
depending on your crush rules.
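
For reference, that output can be collected like this (output file names
are arbitrary):

  ceph osd tree > osd-tree.txt
  ceph osd crush show-tunables > crush-tunables.txt
  ceph osd crush rule dump > crush-rules.txt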

-- 
PS


[ceph-users] stuck with active+undersized+degraded on Jewel after cluster maintenance

2018-08-03 Thread Pawel S
hello!

We did maintenance work (cluster shrinking) on one cluster (Jewel), and
after shutting one of the OSDs down we found that recovery of a PG could
not start because it was stuck "querying" one of the peers. We restarted
this OSD and tried marking it out and in again; nothing helped. Finally
we moved the data off it (the PG was still on it) and removed the OSD
from the CRUSH map and the whole cluster. But recovery still won't start
on any other OSD to recreate this copy. We still have two valid, active
copies, but we would like the PG to be clean.
How can we push recovery to place the third copy somewhere? Replication
size is 3 across hosts and there are plenty of them.
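
For reference, a pool's replica count and rule can be double-checked with
commands like these (the pool name is a placeholder; on Jewel the rule
attribute is still called crush_ruleset):

  ceph osd pool get <pool> size
  ceph osd pool get <pool> min_size
  ceph osd pool get <pool> crush_ruleset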

Status now:
   health HEALTH_WARN
            1 pgs degraded
            1 pgs stuck degraded
            1 pgs stuck unclean
            1 pgs stuck undersized
            1 pgs undersized
            recovery 268/19265130 objects degraded (0.001%)

Link to PG query details, health status and version commit here:
https://gist.github.com/pejotes/aea71ecd2718dbb3ceab0e648924d06b
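
For reference, that kind of output typically comes from commands like
these (<pgid> is a placeholder; the actual id shows up in 'ceph health
detail'):

  ceph health detail
  ceph pg dump_stuck unclean
  ceph pg <pgid> query
  ceph version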

best regards!
-- 
Pawel