[ceph-users] Re: EC PGs stuck activating, 2^31-1 as OSD ID, automatic recovery not kicking in

2019-11-27 Thread Aleksey Gutikov

On 22.11.19 23:45, Paul Emmerich wrote:

> tools), it means no mapping could be found; check your crush map and
> crush rule


The simplest way to get into this state is to change an OSD's reweight on a 
small cluster where the number of OSDs equals the EC width (k+m).
I do not know the details, but it seems the straw2 CRUSH algorithm has no 
default/fallback policy for picking an OSD for a PG.


For example, if you have 8 OSDs, failure_domain=osd and an EC profile with 
k=5,m=3, and you then change the reweight of some OSD, the CRUSH algorithm 
fails to satisfy both requirements at once:

- the failure domain
- the number of PGs on each OSD being proportional to its weight

Btw, I have never seen any explanation of the decision not to pick an OSD in this case.
I mean, why is not picking an OSD and leaving the PG degraded better than 
violating the weight requirement?
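
To illustrate, one way to check a map offline for this effect is to dump it 
and let crushtool simulate mappings with an overridden weight; the rule id, 
replica count and weight below are only placeholders for your own setup:

ceph osd getcrushmap -o crush.bin
# simulate osd.3 reweighted to 0.5 and list inputs that cannot be fully mapped
crushtool -i crush.bin --test --rule 1 --num-rep 8 --weight 3 0.5 --show-bad-mappings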

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: EC PGs stuck activating, 2^31-1 as OSD ID, automatic recovery not kicking in

2019-11-24 Thread huang jun
How many PGs are in the pools?
This may be a case of CRUSH not being able to find a proper OSD.
You can check the tunable choose_total_tries in the crush map
and try increasing it like this:

# dump the current crush map and decompile it to text
ceph osd getcrushmap -o crush
crushtool -d crush -o crush.txt
# raise choose_total_tries from the default 50 to 150
sed -i 's/tunable choose_total_tries 50/tunable choose_total_tries 150/g' crush.txt
# recompile and inject the updated map
crushtool -c crush.txt -o crush.new
ceph osd setcrushmap -i crush.new
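
It may also be worth testing the edited map offline before injecting it, to 
see whether the extra tries actually remove the incomplete mappings (rule id 
and replica count are placeholders for the affected pool):

crushtool -i crush.new --test --rule 1 --num-rep 6 --show-bad-mappings
# no output means crushtool found no incomplete mappings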

Zoltan Arnold Nagy  wrote on Sat, Nov 23, 2019 at 5:26 AM:
>
> On 2019-11-22 21:45, Paul Emmerich wrote:
> > On Fri, Nov 22, 2019 at 9:33 PM Zoltan Arnold Nagy
> >  wrote:
> >
> >> The 2^31-1 in there seems to indicate an overflow somewhere - the way
> >> we
> >> were able to figure out where exactly
> >> is to query the PG and compare the "up" and "acting" sets - only _one_
> >> of them had the 2^31-1 number in place
> >> of the correct OSD number. We restarted that and the PG started doing
> >> its job and recovered.
> >
> > no, this value is intentional (and shows up as 'None' on higher level
> > tools), it means no mapping could be found
>
> thanks, didn't know.
>
> > check your crush map and crush rule
>
> if it were indeed a crush rule or map issue, it would not have been
> resolved by just restarting the primary OSD of the PG, would it?
>
> the crush rule was created by running
>
> ceph osd erasure-code-profile set ec42 k=4 m=2 crush-device-class=nvme
>
> where the default failure domain is host; as I said we have 12 hosts,
> so I don't see anything wrong here - it's all default...
>
> this is why I suspect a bug, just don't have any evidence other than
> that it happened to us :)
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: EC PGs stuck activating, 2^31-1 as OSD ID, automatic recovery not kicking in

2019-11-22 Thread Paul Emmerich
On Fri, Nov 22, 2019 at 9:33 PM Zoltan Arnold Nagy
 wrote:

> The 2^31-1 in there seems to indicate an overflow somewhere - the way we
> were able to figure out where exactly
> is to query the PG and compare the "up" and "acting" sets - only _one_
> of them had the 2^31-1 number in place
> of the correct OSD number. We restarted that and the PG started doing
> its job and recovered.

no, this value is intentional (and shows up as 'None' on higher level
tools), it means no mapping could be found; check your crush map and
crush rule
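
For example, something along these lines (pool name, rule name and PG id are 
placeholders) shows the rule in use and whether a slot stays unmapped:

ceph osd pool get <poolname> crush_rule
ceph osd crush rule dump <rulename>
ceph pg map <pgid>
# 2147483647 (2^31-1) in the up/acting sets marks a slot CRUSH could not fill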

Paul


>
> The issue seems to be going back to 2015:
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-May/001661.html
> however no solution...
>
> I'm more concerned about the cluster not being able to recover (it's a
> 4+2 EC pool across 12 hosts - plenty of room
> to heal) than about the weird print-out.
>
> The VMs who wanted to access data in any of the affected PGs of course
> died.
>
> Are we missing some settings to let the cluster self-heal even for EC
> pools? First EC pool in production :)
>
> Cheers,
> Zoltan
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io