On 5/4/20 10:48 AM, Francois Legrand wrote:
> Hi all,
> During a crash disaster we destroyed and reinstalled a few OSDs under
> different IDs.
> For example, OSD 3 was destroyed and recreated as ID 101 with: ceph osd
> purge 3 --yes-i-really-mean-it, then ceph osd create (to block ID 3),
> then ceph-deploy osd create --data /dev/sdxx <server>, and finally ceph
> osd rm 3.
> Some of our PGs are now incomplete (which is understandable), but they
> are blocked by some of the removed OSDs.
> For example, here is part of the output of ceph pg 30.3 query:
> {
> "state": "incomplete",
> "snap_trimq": "[]",
> "snap_trimq_len": 0,
> "epoch": 384075,
> "up": [
> 103,
> 43,
> 29,
> 2,
> 66
> ],
> "acting": [
> 103,
> 43,
> 29,
> 2,
> 66
> ],
> ........
>
> "peer_info": [
> {
> "peer": "2(3)",
> "pgid": "30.3s3",
> "last_update": "373570'105925965",
> "last_complete": "373570'105925965",
> .......
> },
> "up": [
> 103,
> 43,
> 29,
> 2,
> 66
> ],
> "acting": [
> 103,
> 43,
> 29,
> 2,
> 66
> ],
> "avail_no_missing": [],
> "object_location_counts": [],
> *"blocked_by": [**
> ** 3,**
> ** 49**
> ** ],*
> ............
> "down_osds_we_would_probe": [
> 3
> ],
> "peering_blocked_by": [],
> "peering_blocked_by_detail": [
> {
> * "detail": "peering_blocked_by_history_les_bound"*
> }
> ]
>
>
> I don't understand why the removed OSDs are still considered and show
> up in the PG info.
> Is there a way to get rid of that?
You can try to set:
osd_find_best_info_ignore_history_les = true
Then restart the OSDs involved with that PG.
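For example, something along these lines (an untested sketch; it assumes
a release where "ceph config set" is available, e.g. Nautilus or later,
and uses the acting set of pg 30.3 from your output — on older releases
you would put the option in ceph.conf under [osd] instead):

  # set the flag for all OSD daemons (remember to revert it afterwards)
  ceph config set osd osd_find_best_info_ignore_history_les true

  # restart the OSDs acting for the stuck PG, e.g. for pg 30.3:
  systemctl restart ceph-osd@103 ceph-osd@43 ceph-osd@29 ceph-osd@2 ceph-osd@66

  # once the PG has peered and is active+clean again, remove the override:
  ceph config rm osd osd_find_best_info_ignore_history_les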
Wido
> Moreover, we have tons of slow ops (more than 15,000), but I guess the
> two problems are linked.
> Thanks for your help.
> F.
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]