Hi Sage,

Thank you for your help.


My original issue with slow ops on OSD restarts is gone too, even with the 
default value for paxos_proposal_interval.


It's a bit annoying that I spent many hours debugging this, and in the end I 
had only missed one step in the upgrade.

Only during the upgrade itself, until require_osd_release is set to the new 
version, will there be interruptions.



Regards
Manuel


________________________________
From: Sage Weil <[email protected]>
Sent: Tuesday, 9 November 2021 17:29
To: Manuel Lausch
Subject: Re: [ceph-users] Re: OSD spend too much time on "waiting for readable" 
-> slow ops -> laggy pg -> rgw stop -> worst case osd restart

Yeah, I think that is the problem.  The field that is getting updated by 
prepare_beacon is new in octopus, so if your osdmap still has 
require_osd_release=nautilus then it is trying to set the field, but the field 
is then not getting encoded (for compatibility).  Doing 
`ceph osd require-osd-release octopus` should resolve this.
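
For anyone who wants to check for this condition, here is a minimal sketch, not an official tool. It assumes the JSON output of `ceph osd dump -f json` has been captured as text and that it carries the "require_osd_release" key quoted in this thread. The function name and the sample document are made up for illustration.

```python
import json


def release_flag_mismatch(osd_dump_json: str, running_release: str) -> bool:
    """Return True if the osdmap's require_osd_release differs from running_release.

    osd_dump_json is assumed to be the text of `ceph osd dump -f json`.
    A missing key is also reported as a mismatch.
    """
    osdmap = json.loads(osd_dump_json)
    return osdmap.get("require_osd_release") != running_release


# Made-up, heavily trimmed sample; a real dump has many more keys.
sample = '{"epoch": 1003, "require_osd_release": "nautilus"}'

if release_flag_mismatch(sample, "octopus"):
    print("require_osd_release does not match the running release")
```

The check only compares strings; it does not know release ordering, so it flags any difference rather than deciding which side is older.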

On Tue, Nov 9, 2021 at 9:01 AM Sage Weil <[email protected]> wrote:
What version are you running?  I thought it was pacific or octopus but the 
osdmap says "require_osd_release": "nautilus" which implies the upgrade 
procedure wasn't finished?

sage

On Tue, Nov 9, 2021 at 8:08 AM Manuel Lausch <[email protected]> wrote:
As far as I can see, the maps differ only in the epoch and creation date,
nothing else. I dumped some maps and uploaded them for you:
1f1e1e5e-1c1c-470b-b691-ed820687bab8
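
A comparison like this can be sketched in a few lines of Python. This is only an illustration under assumptions: the dumps are JSON objects as produced by `ceph osd dump <epoch> -f json-pretty`, and the keys that are expected to change on every epoch are called "epoch" and "modified" (the second name is an assumption; adjust the ignore list to whatever the dumps actually contain). The function name and sample data are hypothetical.

```python
def diff_osdmaps(older: dict, newer: dict, ignore=("epoch", "modified")) -> dict:
    """Map each differing top-level key to its (older, newer) pair of values.

    Keys listed in `ignore` are skipped, since they are expected to change
    with every new epoch.
    """
    keys = (set(older) | set(newer)) - set(ignore)
    return {
        key: (older.get(key), newer.get(key))
        for key in sorted(keys)
        if older.get(key) != newer.get(key)
    }


# Made-up, trimmed stand-ins for two consecutive dumps.
map_a = {"epoch": 1000, "modified": "t0", "require_osd_release": "nautilus"}
map_b = {"epoch": 1001, "modified": "t1", "require_osd_release": "nautilus"}

changes = diff_osdmaps(map_a, map_b)
print(changes if changes else "no differences besides the ignored keys")
```

With real files, each dump could be loaded with `json.load(open(path))` before being passed in. Only top-level keys are compared, so a nested change shows up as the whole old and new value of its top-level key.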

On this cluster I don't create snapshots regularly. There have been no
snapshots present for some weeks.

Please let me know if you need further information.

Regards
Manuel


On Tue, 9 Nov 2021 07:40:29 -0600
Sage Weil <[email protected]> wrote:

> Are you sure consecutive maps are identical?  Can you get the latest
> epoch ('ceph osd stat'), and then dump a few consecutive ones?  e.g.
>
> ceph osd dump 1000 -f json-pretty  > 1000
> ceph osd dump 1001 -f json-pretty  > 1001
> ceph osd dump 1002 -f json-pretty  > 1002
> ceph osd dump 1003 -f json-pretty  > 1003
>
> ...and ceph-post-file those?  Based on the logs I think the delta is
> related to snap trimming, but want to confirm.  Thanks!
>
> Thanks!
> sage
>
