Now they are increasing. On Friday I deep-scrubbed them manually and they
completed successfully, but on Monday morning I found the count had risen to
37. Is it best to deep-scrub manually while the cluster is in use? If not,
what is the best way to address this?
Best Regards.
Michel
ceph -s
cluster:
id: cb0caedc-eb5b-42d1-a34f-96facfda8c27
health: HEALTH_WARN
37 pgs not deep-scrubbed in time
services:
mon: 3 daemons, quorum ceph-mon1,ceph-mon2,ceph-mon3 (age 11M)
mgr: ceph-mon2(active, since 11M), standbys: ceph-mon3, ceph-mon1
osd: 48 osds: 48 up (since 11M), 48 in (since 11M)
rgw: 6 daemons active (6 hosts, 1 zones)
data:
pools: 10 pools, 385 pgs
objects: 6.00M objects, 23 TiB
usage: 151 TiB used, 282 TiB / 433 TiB avail
pgs: 381 active+clean
4 active+clean+scrubbing+deep
io:
client: 265 MiB/s rd, 786 MiB/s wr, 3.87k op/s rd, 699 op/s wr
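
For reference, a rough sketch of the kind of commands that could be used to
queue deep scrubs for the flagged PGs by hand and to relax the scrub
scheduling while the backlog drains; the config values below are
illustrative examples, not tuned recommendations:

# Queue a deep scrub for every PG listed in "ceph health detail".
ceph health detail | awk '/not deep-scrubbed since/ {print $2}' |
  while read -r pg; do
    ceph pg deep-scrub "$pg"
  done

# Optionally allow more concurrent scrubs per OSD (Pacific default is 1)
# and/or widen the deep-scrub interval so the warning clears while the
# backlog catches up (value in seconds; two weeks shown as an example).
ceph config set osd osd_max_scrubs 2
ceph config set osd osd_deep_scrub_interval 1209600
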
On Sun, Jan 28, 2024 at 6:14 PM E Taka <[email protected]> wrote:
> OSD 22 appears there more often than the others. Other operations may be
> blocked because a deep scrub has not finished yet. I would remove OSD 22,
> just to be sure about this: ceph orch osd rm osd.22
>
> If this does not help, just add it again.
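>
> Before pulling the OSD, it may be worth checking the disk behind osd.22
> for signs of failure; a rough sketch (the device path is a placeholder):
>
> # Per-OSD commit/apply latency; a dying disk usually stands out here.
> ceph osd perf
>
> # Device health metrics Ceph has collected for osd.22's disk.
> ceph device ls-by-daemon osd.22
>
> # On the host carrying osd.22, query SMART data directly
> # (replace /dev/sdX with the actual device).
> smartctl -a /dev/sdX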
>
> On Fri, Jan 26, 2024 at 08:05, Michel Niyoyita <[email protected]> wrote:
>
>> It seems they are different OSDs, as shown here. How did you manage to
>> sort this out?
>>
>> ceph pg dump | grep -F 6.78
>> dumped all
>> 6.78 44268 0 0 0 0 178679640118 0 0 10099 10099 active+clean 2024-01-26T03:51:26.781438+0200 107547'115445304 107547:225274427 [12,36,37] 12 [12,36,37] 12 106977'114532385 2024-01-24T08:37:53.597331+0200 101161'109078277 2024-01-11T16:07:54.875746+0200 0
>> root@ceph-osd3:~# ceph pg dump | grep -F 6.60
>> dumped all
>> 6.60 44449 0 0 0 0 179484338742 716 36 10097 10097 active+clean 2024-01-26T03:50:44.579831+0200 107547'153238805 107547:287193139 [32,5,29] 32 [32,5,29] 32 107231'152689835 2024-01-25T02:34:01.849966+0200 102171'147920798 2024-01-13T19:44:26.922000+0200 0
>> 6.3a 44807 0 0 0 0 180969005694 0 0 10093 10093 active+clean 2024-01-26T03:53:28.837685+0200 107547'114765984 107547:238170093 [22,13,11] 22 [22,13,11] 22 106945'113739877 2024-01-24T04:10:17.224982+0200 102863'109559444 2024-01-15T05:31:36.606478+0200 0
>> root@ceph-osd3:~# ceph pg dump | grep -F 6.5c
>> 6.5c 44277 0 0 0 0 178764978230 0 0 10051 10051 active+clean 2024-01-26T03:55:23.339584+0200 107547'126480090 107547:264432655 [22,37,30] 22 [22,37,30] 22 107205'125858697 2024-01-24T22:32:10.365869+0200 101941'120957992 2024-01-13T09:07:24.780936+0200 0
>> dumped all
>> root@ceph-osd3:~# ceph pg dump | grep -F 4.12
>> dumped all
>> 4.12 0 0 0 0 0 0 0 0 0 0 active+clean 2024-01-24T08:36:48.284388+0200 0'0 107546:152711 [22,19,7] 22 [22,19,7] 22 0'0 2024-01-24T08:36:48.284307+0200 0'0 2024-01-13T09:09:22.176240+0200 0
>> root@ceph-osd3:~# ceph pg dump | grep -F 10.d
>> dumped all
>> 10.d 0 0 0 0 0 0 0 0 0 0 active+clean 2024-01-24T04:04:33.641541+0200 0'0 107546:142651 [14,28,1] 14 [14,28,1] 14 0'0 2024-01-24T04:04:33.641451+0200 0'0 2024-01-12T08:04:02.078062+0200 0
>> root@ceph-osd3:~# ceph pg dump | grep -F 5.f
>> dumped all
>> 5.f 0 0 0 0 0 0 0 0 0 0 active+clean 2024-01-25T08:19:04.148941+0200 0'0 107546:161331 [11,24,35] 11 [11,24,35] 11 0'0 2024-01-25T08:19:04.148837+0200 0'0 2024-01-12T06:06:00.970665+0200 0
>>
>>
>> On Fri, Jan 26, 2024 at 8:58 AM E Taka <[email protected]> wrote:
>>
>>> We had the same problem. It turned out that one disk was slowly dying.
>>> It was easy to identify with these commands (in your case):
>>>
>>> ceph pg dump | grep -F 6.78
>>> ceph pg dump | grep -F 6.60
>>> …
>>>
>>> This command shows the OSDs of a PG in square brackets. If the same
>>> number appears there every time, then you've found the OSD that is
>>> causing the slow scrubs.
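>>>
>>> A quick way to tally which OSD shows up most often across the lagging
>>> PGs is a sketch like the one below (the PG IDs are taken from your
>>> health detail output; adjust the list as needed):
>>>
>>> for pg in 6.78 6.60 6.5c 4.12 10.d 5.f; do
>>>   # "ceph pg map" prints the up and acting sets; keep the acting set
>>>   ceph pg map "$pg" | awk '{print $NF}'
>>> done | tr -d '[]' | tr ',' '\n' | sort -n | uniq -c | sort -rn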
>>>
>>> On Fri, Jan 26, 2024 at 07:45, Michel Niyoyita <[email protected]> wrote:
>>>
>>>> Hello team,
>>>>
>>>> I have a production cluster composed of 3 OSD servers with 20 disks each,
>>>> deployed using ceph-ansible on Ubuntu; the version is Pacific. These days
>>>> it is in WARN state because of PGs that are not deep-scrubbed in time. I
>>>> tried to deep-scrub some PGs manually, but it seems the cluster may slow
>>>> down. I would like your assistance so that my cluster can return to the
>>>> HEALTH_OK state as before, without any interruption of service. The
>>>> cluster is used as OpenStack backend storage.
>>>>
>>>> Best Regards
>>>>
>>>> Michel
>>>>
>>>>
>>>> ceph -s
>>>> cluster:
>>>> id: cb0caedc-eb5b-42d1-a34f-96facfda8c27
>>>> health: HEALTH_WARN
>>>> 6 pgs not deep-scrubbed in time
>>>>
>>>> services:
>>>> mon: 3 daemons, quorum ceph-mon1,ceph-mon2,ceph-mon3 (age 11M)
>>>> mgr: ceph-mon2(active, since 11M), standbys: ceph-mon3, ceph-mon1
>>>> osd: 48 osds: 48 up (since 11M), 48 in (since 11M)
>>>> rgw: 6 daemons active (6 hosts, 1 zones)
>>>>
>>>> data:
>>>> pools: 10 pools, 385 pgs
>>>> objects: 5.97M objects, 23 TiB
>>>> usage: 151 TiB used, 282 TiB / 433 TiB avail
>>>> pgs: 381 active+clean
>>>> 4 active+clean+scrubbing+deep
>>>>
>>>> io:
>>>> client: 59 MiB/s rd, 860 MiB/s wr, 155 op/s rd, 665 op/s wr
>>>>
>>>> root@ceph-osd3:~# ceph health detail
>>>> HEALTH_WARN 6 pgs not deep-scrubbed in time
>>>> [WRN] PG_NOT_DEEP_SCRUBBED: 6 pgs not deep-scrubbed in time
>>>> pg 6.78 not deep-scrubbed since 2024-01-11T16:07:54.875746+0200
>>>> pg 6.60 not deep-scrubbed since 2024-01-13T19:44:26.922000+0200
>>>> pg 6.5c not deep-scrubbed since 2024-01-13T09:07:24.780936+0200
>>>> pg 4.12 not deep-scrubbed since 2024-01-13T09:09:22.176240+0200
>>>> pg 10.d not deep-scrubbed since 2024-01-12T08:04:02.078062+0200
>>>> pg 5.f not deep-scrubbed since 2024-01-12T06:06:00.970665+0200
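>>>>
>>>> For context, the deadline behind this warning is derived from the
>>>> deep-scrub interval (plus a warning ratio), so one way to see what is
>>>> driving it is to inspect the current scrub-related settings, e.g.:
>>>>
>>>> ceph config get osd osd_deep_scrub_interval   # default is one week
>>>> ceph config get osd osd_max_scrubs            # concurrent scrubs per OSD
>>>> ceph config get osd osd_scrub_begin_hour
>>>> ceph config get osd osd_scrub_end_hour
>>>> ceph config get mon mon_warn_pg_not_deep_scrubbed_ratio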
>>>
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]