On 5/4/20 10:48 AM, Francois Legrand wrote:
> Hi all,
> During a crash disaster we destroyed and reinstalled a few OSDs under
> different IDs.
> For example, OSD 3 was destroyed and recreated as ID 101 with: ceph osd
> purge 3 --yes-i-really-mean-it, then ceph osd create (to block ID 3),
> then ceph-deploy osd create --data /dev/sdxx <server>, and finally ceph
> osd rm 3.
> Some of our PGs are now incomplete (which is understandable), but they
> are blocked by some of the removed OSDs.
> For example, here is part of the output of ceph pg 30.3 query:
> {
> "state": "incomplete",
> "snap_trimq": "[]",
> "snap_trimq_len": 0,
> "epoch": 384075,
> "up": [
> 103,
> 43,
> 29,
> 2,
> 66
> ],
> "acting": [
> 103,
> 43,
> 29,
> 2,
> 66
> ],
> ........
>
> "peer_info": [
> {
> "peer": "2(3)",
> "pgid": "30.3s3",
> "last_update": "373570'105925965",
> "last_complete": "373570'105925965",
> .......
> },
> "up": [
> 103,
> 43,
> 29,
> 2,
> 66
> ],
> "acting": [
> 103,
> 43,
> 29,
> 2,
> 66
> ],
> "avail_no_missing": [],
> "object_location_counts": [],
> *"blocked_by": [**
> ** 3,**
> ** 49**
> ** ],*
> ............
> "down_osds_we_would_probe": [
> 3
> ],
> "peering_blocked_by": [],
> "peering_blocked_by_detail": [
> {
> * "detail": "peering_blocked_by_history_les_bound"*
> }
> ]
>
>
> I don't understand why the removed OSDs are still considered and show
> up in the PG info.
> Is there a way to get rid of that?
You can try to set:
osd_find_best_info_ignore_history_les = true
Then restart the OSDs involved with that PG.
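For example, something along these lines (an untested sketch; it assumes
a release where "ceph config set" is available, e.g. Nautilus or later,
and uses the acting set of pg 30.3 from your output — on older releases
you would put the option in ceph.conf under [osd] instead):

  # set the flag for all OSD daemons (remember to revert it afterwards)
  ceph config set osd osd_find_best_info_ignore_history_les true

  # restart the OSDs acting for the stuck PG, e.g. for pg 30.3:
  systemctl restart ceph-osd@103 ceph-osd@43 ceph-osd@29 ceph-osd@2 ceph-osd@66

  # once the PG has peered and is active+clean again, remove the override:
  ceph config rm osd osd_find_best_info_ignore_history_les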
Wido
> Moreover, we have tons of slow ops (more than 15,000), but I guess the
> two problems are linked.
> Thanks for your help.
> F.
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]