Thank you for pointing this out. I checked my cluster using the command given 
in the article, and it shows over 17 million PG dups on each OSD.
May I ask whether the snaptrim activity takes place every six hours? If I 
disable snaptrim, will that stop the slow ops temporarily until I perform the 
version upgrade?
Upgrading Ceph will take time, because I first need to analyze the environment. 
Is there a quick workaround, such as deleting each OSD and recreating it to 
zero out its log, or manually deleting the OSD log?



________________________________
From: Boris <b...@kervyn.de>
Sent: Wednesday, December 6, 2023 10:13
To: Peter <peter...@raksmart.com>
Cc: ceph-users@ceph.io <ceph-users@ceph.io>
Subject: Re: [ceph-users] Assistance Needed with Ceph Cluster Slow Ops Issue

Hi Peter,

Try setting the nosnaptrim flag on the cluster.

If this helps, you might need to upgrade to Pacific, because you are hitting 
the PG dups bug.

See: https://www.clyso.com/blog/how-to-identify-osds-affected-by-pg-dup-bug/
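
For reference, the flag mentioned above is toggled with "ceph osd set 
nosnaptrim" and "ceph osd unset nosnaptrim". The following is a minimal Python 
sketch that wraps those two calls. The helper names and the dry-run default are 
hypothetical and only for illustration; it assumes the ceph CLI is on the PATH 
of an admin node.

```python
import subprocess


def snaptrim_flag_command(pause: bool) -> list[str]:
    """Build the ceph CLI call that pauses (True) or resumes (False) snaptrim."""
    action = "set" if pause else "unset"
    return ["ceph", "osd", action, "nosnaptrim"]


def run_flag_command(cmd: list[str], dry_run: bool = True) -> None:
    # dry_run defaults to True so nothing changes on the cluster unless
    # the caller asks for it explicitly (hypothetical safety switch).
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


run_flag_command(snaptrim_flag_command(pause=True))  # prints: ceph osd set nosnaptrim
```

Setting the flag only pauses trimming. The snapshots waiting to be trimmed stay 
queued and are processed once the flag is unset again.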


Best regards
 - Boris Behrens

On 06.12.2023 at 19:01, Peter <peter...@raksmart.com> wrote:

Dear all,


I am reaching out regarding an issue with our Ceph cluster that recurs every 
six hours. Investigating with the "ceph daemon osd.<id> dump_historic_slow_ops" 
command, I observed slow operations that get stuck at "waiting for rw locks". 
The wait times often range from one to two seconds.

Our cluster uses Samsung SAS SSDs for the storage pool in question. 
While these disks are of high quality and should provide sufficient speed, the 
problem persists. The slow ops occur consistently every six hours.

I would greatly appreciate any insights or suggestions you may have to address 
and resolve this issue. If there are specific optimizations or configurations 
that could improve the situation, please advise.


Below is some output:

root@lasas003:~# ceph -v
ceph version 15.2.17 (542df8d06ef24dbddcf4994db16bcc4c89c9ec2d) octopus (stable)


"events": [

                   {
                       "event": "initiated",
                       "time": "2023-12-06T08:34:18.501644-0800",
                       "duration": 0
                   },
                   {
                       "event": "throttled",
                       "time": "2023-12-06T08:34:18.501644-0800",
                       "duration": 3.067e-06
                   },
                   {
                       "event": "header_read",
                       "time": "2023-12-06T08:34:18.501647-0800",
                       "duration": 3.5429999999999998e-06
                   },
                   {
                       "event": "all_read",
                       "time": "2023-12-06T08:34:18.501650-0800",
                       "duration": 9.3399999999999997e-07
                   },
                   {
                       "event": "dispatched",
                       "time": "2023-12-06T08:34:18.501651-0800",
                       "duration": 3.2830000000000002e-06
                   },
                   {
                       "event": "queued_for_pg",
                       "time": "2023-12-06T08:34:18.501654-0800",
                       "duration": 1.3819939990000001
                   },
                   {
                       "event": "reached_pg",
                       "time": "2023-12-06T08:34:19.883648-0800",
                       "duration": 5.7980000000000002e-06
                   },
                   {
                       "event": "waiting for rw locks",
                       "time": "2023-12-06T08:34:19.883654-0800",
                       "duration": 4.2484711649999998
                   },
                   {
                       "event": "reached_pg",
                       "time": "2023-12-06T08:34:24.132125-0800",
                       "duration": 1.0667e-05
                   },
                   {
                       "event": "waiting for rw locks",
                       "time": "2023-12-06T08:34:24.132136-0800",
                       "duration": 2.1593527840000002
                   },
                   {
                       "event": "reached_pg",
                       "time": "2023-12-06T08:34:26.291489-0800",
                       "duration": 3.292e-06
                   },
                   {
                       "event": "waiting for rw locks",
                       "time": "2023-12-06T08:34:26.291492-0800",
                       "duration": 0.43918164700000001
                   },
                   {
                       "event": "reached_pg",
                       "time": "2023-12-06T08:34:26.730674-0800",
                       "duration": 5.1529999999999996e-06
                   },
                   {
                       "event": "waiting for rw locks",
                       "time": "2023-12-06T08:34:26.730679-0800",
                       "duration": 1.0531516869999999
                   },
                   {
                       "event": "reached_pg",
                       "time": "2023-12-06T08:34:27.783831-0800",
                       "duration": 5.1329999999999998e-06
                   },
                   {
                       "event": "waiting for rw locks",
                       "time": "2023-12-06T08:34:27.783836-0800",
                       "duration": 1.232525088
                   },
                   {
                       "event": "reached_pg",
                       "time": "2023-12-06T08:34:29.016361-0800",
                       "duration": 3.844e-06
                   },
                   {
                       "event": "waiting for rw locks",
                       "time": "2023-12-06T08:34:29.016365-0800",
                       "duration": 0.0051385700000000003
                   },
                   {
                       "event": "reached_pg",
                       "time": "2023-12-06T08:34:29.021503-0800",
                       "duration": 4.7600000000000002e-06
                   },
                   {
                       "event": "waiting for rw locks",
                       "time": "2023-12-06T08:34:29.021508-0800",
                       "duration": 0.0092808779999999994
                   },
                   {
                       "event": "reached_pg",
                       "time": "2023-12-06T08:34:29.030789-0800",
                       "duration": 4.0690000000000003e-06
                   },
                   {
                       "event": "waiting for rw locks",
                       "time": "2023-12-06T08:34:29.030793-0800",
                       "duration": 0.55757725499999999
                   },
                   {
                       "event": "reached_pg",
                       "time": "2023-12-06T08:34:29.588370-0800",
                       "duration": 5.5060000000000003e-06
                   },
                   {
                       "event": "waiting for rw locks",
                       "time": "2023-12-06T08:34:29.588376-0800",
                       "duration": 0.0064168929999999999
                   },
                   {
                       "event": "reached_pg",
                       "time": "2023-12-06T08:34:29.594793-0800",
                       "duration": 7.0690000000000004e-06
                   },
                   {
                       "event": "waiting for rw locks",
                       "time": "2023-12-06T08:34:29.594800-0800",
                       "duration": 0.0026404089999999998
                   },
                   {
                       "event": "reached_pg",
                       "time": "2023-12-06T08:34:29.597440-0800",
                       "duration": 3.3440000000000001e-06
                   },
                   {
                       "event": "waiting for rw locks",
                       "time": "2023-12-06T08:34:29.597444-0800",
                       "duration": 0.0051126670000000004
                   },
                   {
                       "event": "reached_pg",
                       "time": "2023-12-06T08:34:29.602556-0800",
                       "duration": 5.0200000000000002e-06
                   },
                   {
                       "event": "waiting for rw locks",
                       "time": "2023-12-06T08:34:29.602561-0800",
                       "duration": 0.0040569960000000002
                   },
                   {
                       "event": "reached_pg",
                       "time": "2023-12-06T08:34:29.606618-0800",
                       "duration": 5.0989999999999998e-06
                   },
                   {
                       "event": "waiting for rw locks",
                       "time": "2023-12-06T08:34:29.606623-0800",
                       "duration": 0.0068874100000000001
                   },
                   {
                       "event": "reached_pg",
                       "time": "2023-12-06T08:34:29.613511-0800",
                       "duration": 1.4636e-05
                   },
                   {
                       "event": "started",
                       "time": "2023-12-06T08:34:29.613525-0800",
                       "duration": 0.00028943699999999997
                   },
                   {
                       "event": "done",
                       "time": "2023-12-06T08:34:29.613815-0800",
                       "duration": 11.112171102
                   }
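
To see where the time in such an op goes, the per-event durations in the list 
above can be summed by event name. The sketch below is a minimal Python 
example, not part of the original output. It assumes that each "duration" is 
the time spent in that stage and that the final "done" entry holds the total 
for the whole op, which matches the timestamps shown. The function name is 
hypothetical, and the sample contains only three entries copied from the dump.

```python
import json
from collections import defaultdict


def time_per_event(events):
    """Sum the "duration" field per event name, skipping the "done" entry."""
    totals = defaultdict(float)
    for ev in events:
        if ev["event"] == "done":
            continue  # assumed to be the whole op's duration, not a stage
        totals[ev["event"]] += ev["duration"]
    return dict(totals)


# Three entries copied from the event list above.
sample = json.loads("""
[
  {"event": "queued_for_pg",
   "time": "2023-12-06T08:34:18.501654-0800",
   "duration": 1.3819939990000001},
  {"event": "waiting for rw locks",
   "time": "2023-12-06T08:34:19.883654-0800",
   "duration": 4.2484711649999998},
  {"event": "waiting for rw locks",
   "time": "2023-12-06T08:34:24.132136-0800",
   "duration": 2.1593527840000002}
]
""")

totals = time_per_event(sample)
# Print the stages from slowest to fastest.
for name, secs in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:25s} {secs:.6f} s")
```

Run against the full "events" array, this shows how much of the roughly 11 
seconds was spent in "waiting for rw locks" versus "queued_for_pg".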


Thank you in advance for your assistance.
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io