[ceph-users] Re: Unable to fix 1 Inconsistent PG

2023-10-11 Thread Wesley Dillingham
Just to be clear, you should remove the OSD by stopping the daemon and
marking it out before you repair the PG. The PG may not be repairable
until you remove the bad disk.

1 - identify the bad disk (via scrubs or SMART/dmesg inspection)
2 - stop daemon and mark it out
3 - wait for PG to finish backfill
4 - issue the pg repair
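
A rough sketch of those steps as commands, assuming a systemd-managed
(non-containerized) deployment and that osd.238 turns out to be the bad disk
(substitute whichever OSD your inspection implicates):

systemctl stop ceph-osd@238                  # on the host carrying osd.238
ceph osd out 238                             # mark it out so the PG backfills elsewhere
ceph pg 15.f4f query | grep -m1 '"state"'    # wait for active+clean+inconsistent
ceph pg repair 15.f4f                        # then issue the repair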

Respectfully,

*Wes Dillingham*
w...@wesdillingham.com
LinkedIn 


On Wed, Oct 11, 2023 at 4:38 PM Wesley Dillingham 
wrote:

> If I recall correctly, when the acting or up set of a PG changes, the scrub
> information is lost. It was likely lost when you stopped osd.238 and changed
> the sets.
>
> Based on your initial post, I do not believe you need to be using the
> objectstore tool at this point. Inconsistent PGs are a common occurrence and
> can be repaired.
>
> Given your most recent post, I would get osd.238 back in the cluster unless
> you have reason to believe it is the failing hardware. The failing disk could
> be behind any of the OSDs in the following set (from your initial post):
> [238,106,402,266,374,498,590,627,684,73,66]
>
> You should inspect the SMART data and dmesg on the drives and servers
> supporting the above OSDs to determine which one is failing.
>
> After you get the PG back to active+clean+inconsistent (get osd.238 back in
> and let it finish its backfill), you can re-issue a manual deep-scrub of it.
> Once that deep-scrub finishes, "rados list-inconsistent-obj 15.f4f" should
> return data and implicate a single OSD with errors.
>
> Finally you should issue the PG repair again.
>
> In order to get your manually issued scrubs and repairs to start sooner
> you may want to set the noscrub and nodeep-scrub flags until you can get
> your PG repaired.
>
> As an aside, osd_max_scrubs of 9 is too aggressive IMO; I would drop that
> back to 3, max.
>
>
> Respectfully,
>
> *Wes Dillingham*
> w...@wesdillingham.com
> LinkedIn 
>
>
> On Wed, Oct 11, 2023 at 10:51 AM Siddhit Renake 
> wrote:
>
>> Hello Wes,
>>
>> Thank you for your response.
>>
>> brc1admin:~ # rados list-inconsistent-obj 15.f4f
>> No scrub information available for pg 15.f4f
>>
>> brc1admin:~ # ceph osd ok-to-stop osd.238
>> OSD(s) 238 are ok to stop without reducing availability or risking data,
>> provided there are no other concurrent failures or interventions.
>> 341 PGs are likely to be degraded (but remain available) as a result.
>>
>> Before I proceed with your suggested action plan, I need clarification on
>> the following.
>> In order to list all objects residing on the inconsistent PG, we had
>> stopped the primary OSD (osd.238) and extracted the list of all objects
>> residing on this OSD using the ceph-objectstore tool. We noticed that when
>> we stop the OSD (osd.238) using systemctl, the RGW gateways continuously
>> restart, which impacts our S3 service availability. This was observed
>> twice when we stopped osd.238 for general maintenance activity with the
>> ceph-objectstore tool. How can we ensure that stopping and marking out
>> osd.238 (the primary OSD of the inconsistent PG) does not impact RGW
>> service availability?
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Unable to fix 1 Inconsistent PG

2023-10-11 Thread Wesley Dillingham
If I recall correctly, when the acting or up set of a PG changes, the scrub
information is lost. It was likely lost when you stopped osd.238 and changed
the sets.
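
To confirm what the up and acting sets currently look like, e.g.:

ceph pg map 15.f4f    # prints the current up and acting OSD sets for the PG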

Based on your initial post, I do not believe you need to be using the
objectstore tool at this point. Inconsistent PGs are a common occurrence and
can be repaired.

Given your most recent post, I would get osd.238 back in the cluster unless
you have reason to believe it is the failing hardware. The failing disk could
be behind any of the OSDs in the following set (from your initial post):
[238,106,402,266,374,498,590,627,684,73,66]
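
Getting it back in would look roughly like this, assuming a systemd-managed
(non-containerized) deployment:

systemctl start ceph-osd@238   # on osd.238's host, if the daemon is stopped
ceph osd in 238                # mark it in again so the PG can backfill onto it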

You should inspect the SMART data and dmesg on the drives and servers
supporting the above OSDs to determine which one is failing.
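
A few commands commonly used for that inspection (assuming smartmontools is
installed; /dev/sdX is a placeholder for the OSD's data device):

ceph device ls-by-daemon osd.238   # map the daemon to its physical device, if device tracking is enabled
smartctl -a /dev/sdX               # check for reallocated/pending sectors and media errors
dmesg -T | grep -iE 'i/o error|medium error'   # kernel-level read/write errors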

After you get the PG back to active+clean+inconsistent (get osd.238 back in
and let it finish its backfill), you can re-issue a manual deep-scrub of it.
Once that deep-scrub finishes, "rados list-inconsistent-obj 15.f4f" should
return data and implicate a single OSD with errors.
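
That is, something along these lines:

ceph pg deep-scrub 15.f4f
# once the deep-scrub has completed:
rados list-inconsistent-obj 15.f4f --format=json-pretty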

Finally you should issue the PG repair again.

In order to get your manually issued scrubs and repairs to start sooner you
may want to set the noscrub and nodeep-scrub flags until you can get your
PG repaired.
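
For example (and remember to unset the flags once the repair is done):

ceph osd set noscrub
ceph osd set nodeep-scrub
# ... manually issued deep-scrub / repair of 15.f4f ...
ceph osd unset noscrub
ceph osd unset nodeep-scrub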

As an aside, osd_max_scrubs of 9 is too aggressive IMO; I would drop that
back to 3, max.
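
One way to apply that on a Nautilus cluster would be something like:

ceph config set osd osd_max_scrubs 3
# or push it to the running daemons immediately:
ceph tell osd.* injectargs '--osd_max_scrubs 3'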


Respectfully,

*Wes Dillingham*
w...@wesdillingham.com
LinkedIn 


On Wed, Oct 11, 2023 at 10:51 AM Siddhit Renake 
wrote:

> Hello Wes,
>
> Thank you for your response.
>
> brc1admin:~ # rados list-inconsistent-obj 15.f4f
> No scrub information available for pg 15.f4f
>
> brc1admin:~ # ceph osd ok-to-stop osd.238
> OSD(s) 238 are ok to stop without reducing availability or risking data,
> provided there are no other concurrent failures or interventions.
> 341 PGs are likely to be degraded (but remain available) as a result.
>
> Before I proceed with your suggested action plan, I need clarification on
> the following.
> In order to list all objects residing on the inconsistent PG, we had
> stopped the primary OSD (osd.238) and extracted the list of all objects
> residing on this OSD using the ceph-objectstore tool. We noticed that when
> we stop the OSD (osd.238) using systemctl, the RGW gateways continuously
> restart, which impacts our S3 service availability. This was observed
> twice when we stopped osd.238 for general maintenance activity with the
> ceph-objectstore tool. How can we ensure that stopping and marking out
> osd.238 (the primary OSD of the inconsistent PG) does not impact RGW
> service availability?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Unable to fix 1 Inconsistent PG

2023-10-11 Thread Siddhit Renake
Hello Wes,

Thank you for your response.

brc1admin:~ # rados list-inconsistent-obj 15.f4f
No scrub information available for pg 15.f4f

brc1admin:~ # ceph osd ok-to-stop osd.238
OSD(s) 238 are ok to stop without reducing availability or risking data, 
provided there are no other concurrent failures or interventions.
341 PGs are likely to be degraded (but remain available) as a result.

Before I proceed with your suggested action plan, I need clarification on the
following.
In order to list all objects residing on the inconsistent PG, we had stopped
the primary OSD (osd.238) and extracted the list of all objects residing on
this OSD using the ceph-objectstore tool. We noticed that when we stop the
OSD (osd.238) using systemctl, the RGW gateways continuously restart, which
impacts our S3 service availability. This was observed twice when we stopped
osd.238 for general maintenance activity with the ceph-objectstore tool. How
can we ensure that stopping and marking out osd.238 (the primary OSD of the
inconsistent PG) does not impact RGW service availability?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Unable to fix 1 Inconsistent PG

2023-10-10 Thread Wesley Dillingham
In case it's not obvious, I forgot a space: "rados list-inconsistent-obj
15.f4f"

Respectfully,

*Wes Dillingham*
w...@wesdillingham.com
LinkedIn 


On Tue, Oct 10, 2023 at 4:55 PM Wesley Dillingham 
wrote:

> You likely have a failing disk, what does "rados
> list-inconsistent-obj15.f4f" return?
>
> It should identify the failing osd. Assuming "ceph osd ok-to-stop "
> returns in the affirmative for that osd, you likely need to stop the
> associated osd daemon, then mark it out "ceph osd out  wait for it
> to backfill the inconsistent PG and then re-issue the repair. Then turn to
> replacing the disk.
>
> Respectfully,
>
> *Wes Dillingham*
> w...@wesdillingham.com
> LinkedIn 
>
>
> On Tue, Oct 10, 2023 at 4:46 PM  wrote:
>
>> Hello All,
>> Greetings. We have a Ceph cluster running version
>> ceph version 14.2.16-402-g7d47dbaf4d
>> (7d47dbaf4d0960a2e910628360ae36def84ed913) nautilus (stable)
>>
>>
>> ===
>>
>> Issue: 1 PG is in an inconsistent state and does not recover.
>>
>> # ceph -s
>>   cluster:
>> id: 30d6f7ee-fa02-4ab3-8a09-9321c8002794
>> health: HEALTH_ERR
>> 2 large omap objects
>> 1 pools have many more objects per pg than average
>> 159224 scrub errors
>> Possible data damage: 1 pg inconsistent
>> 2 pgs not deep-scrubbed in time
>> 2 pgs not scrubbed in time
>>
>> # ceph health detail
>>
>> HEALTH_ERR 2 large omap objects; 1 pools have many more objects per pg
>> than average; 159224 scrub errors; Possible data damage: 1 pg inconsistent;
>> 2 pgs not deep-scrubbed in time; 2 pgs not scrubbed in time
>> LARGE_OMAP_OBJECTS 2 large omap objects
>> 2 large objects found in pool 'default.rgw.log'
>> Search the cluster log for 'Large omap object found' for more details.
>> MANY_OBJECTS_PER_PG 1 pools have many more objects per pg than average
>> pool iscsi-images objects per pg (541376) is more than 14.9829 times
>> cluster average (36133)
>> OSD_SCRUB_ERRORS 159224 scrub errors
>> PG_DAMAGED Possible data damage: 1 pg inconsistent
>> pg 15.f4f is active+clean+inconsistent, acting
>> [238,106,402,266,374,498,590,627,684,73,66]
>> PG_NOT_DEEP_SCRUBBED 2 pgs not deep-scrubbed in time
>> pg 1.5c not deep-scrubbed since 2021-04-05 23:20:13.714446
>> pg 1.55 not deep-scrubbed since 2021-04-11 07:12:37.185074
>> PG_NOT_SCRUBBED 2 pgs not scrubbed in time
>> pg 1.5c not scrubbed since 2023-07-10 21:15:50.352848
>> pg 1.55 not scrubbed since 2023-06-24 10:02:10.038311
>>
>> ==
>>
>>
>> We have tried the below commands to resolve it:
>>
>> 1. We ran the PG repair command "ceph pg repair 15.f4f".
>> 2. We restarted the associated OSDs mapped to pg 15.f4f.
>> 3. We tuned the osd_max_scrubs value and set it to 9.
>> 4. We ran a scrub and a deep scrub via "ceph pg scrub 15.4f4" & "ceph pg
>> deep-scrub 15.f4f".
>> 5. We also tried the ceph-objectstore-tool command to fix it.
>> ==
>>
>> We have checked the logs of the primary OSD of the respective
>> inconsistent PG and found the below errors.
>> [ERR] : 15.f4fs0 shard 402(2)
>> 15:f2f3fff4:::94a51ddb-a94f-47bc-9068-509e8c09af9a.7862003.20_c%2f4%2fd61%2f885%2f49627697%2f192_1.ts:head
>> : missing
>> /var/log/ceph/ceph-osd.238.log:339:2023-10-06 00:37:06.410 7f65024cb700
>> -1 log_channel(cluster) log [ERR] : 15.f4fs0 shard 266(3)
>> 15:f2f2:::94a51ddb-a94f-47bc-9068-509e8c09af9a.11432468.3_TN8QHE_04.20.2020_08.41%2fCV_MAGNETIC%2fV_274396%2fCHUNK_2440801%2fSFILE_CONTAINER_031.FOLDER%2f3:head
>> : missing
>> /var/log/ceph/ceph-osd.238.log:340:2023-10-06 00:37:06.410 7f65024cb700
>> -1 log_channel(cluster) log [ERR] : 15.f4fs0 shard 402(2)
>> 15:f2f2:::94a51ddb-a94f-47bc-9068-509e8c09af9a.11432468.3_TN8QHE_04.20.2020_08.41%2fCV_MAGNETIC%2fV_274396%2fCHUNK_2440801%2fSFILE_CONTAINER_031.FOLDER%2f3:head
>> : missing
>> /var/log/ceph/ceph-osd.238.log:341:2023-10-06 00:37:06.410 7f65024cb700
>> -1 log_channel(cluster) log [ERR] : 15.f4fs0 shard 590(6)
>> 15:f2f2:::94a51ddb-a94f-47bc-9068-509e8c09af9a.11432468.3_TN8QHE_04.20.2020_08.41%2fCV_MAGNETIC%2fV_274396%2fCHUNK_2440801%2fSFILE_CONTAINER_031.FOLDER%2f3:head
>> : missing
>> ===
>> We also noticed that the number of scrub errors in the ceph health status
>> matches the ERR log entries in the primary OSD's log for the inconsistent
>> PG, as below:
>> grep -Hn 'ERR' /var/log/ceph/ceph-osd.238.log|wc -l
>> 159226
>> 
>> Ceph is cleaning up the scrub errors, but the repair rate is very slow
>> (an average of 200 scrub errors per day). We want to increase the repair
>> rate in order to finish cleaning up the 159224 pending scrub errors.
>>
>> #ceph pg 15.f4f query
>>
>>
>> {
>> "state": "active+clean+inconsistent",
>> "snap_trimq": "[]",

[ceph-users] Re: Unable to fix 1 Inconsistent PG

2023-10-10 Thread Wesley Dillingham
You likely have a failing disk, what does "rados
list-inconsistent-obj15.f4f" return?

It should identify the failing osd. Assuming "ceph osd ok-to-stop "
returns in the affirmative for that osd, you likely need to stop the
associated osd daemon, then mark it out "ceph osd out  wait for it
to backfill the inconsistent PG and then re-issue the repair. Then turn to
replacing the disk.

Respectfully,

*Wes Dillingham*
w...@wesdillingham.com
LinkedIn 


On Tue, Oct 10, 2023 at 4:46 PM  wrote:

> Hello All,
> Greetings. We have a Ceph cluster running version
> ceph version 14.2.16-402-g7d47dbaf4d
> (7d47dbaf4d0960a2e910628360ae36def84ed913) nautilus (stable)
>
>
> ===
>
> Issue: 1 PG is in an inconsistent state and does not recover.
>
> # ceph -s
>   cluster:
> id: 30d6f7ee-fa02-4ab3-8a09-9321c8002794
> health: HEALTH_ERR
> 2 large omap objects
> 1 pools have many more objects per pg than average
> 159224 scrub errors
> Possible data damage: 1 pg inconsistent
> 2 pgs not deep-scrubbed in time
> 2 pgs not scrubbed in time
>
> # ceph health detail
>
> HEALTH_ERR 2 large omap objects; 1 pools have many more objects per pg
> than average; 159224 scrub errors; Possible data damage: 1 pg inconsistent;
> 2 pgs not deep-scrubbed in time; 2 pgs not scrubbed in time
> LARGE_OMAP_OBJECTS 2 large omap objects
> 2 large objects found in pool 'default.rgw.log'
> Search the cluster log for 'Large omap object found' for more details.
> MANY_OBJECTS_PER_PG 1 pools have many more objects per pg than average
> pool iscsi-images objects per pg (541376) is more than 14.9829 times
> cluster average (36133)
> OSD_SCRUB_ERRORS 159224 scrub errors
> PG_DAMAGED Possible data damage: 1 pg inconsistent
> pg 15.f4f is active+clean+inconsistent, acting
> [238,106,402,266,374,498,590,627,684,73,66]
> PG_NOT_DEEP_SCRUBBED 2 pgs not deep-scrubbed in time
> pg 1.5c not deep-scrubbed since 2021-04-05 23:20:13.714446
> pg 1.55 not deep-scrubbed since 2021-04-11 07:12:37.185074
> PG_NOT_SCRUBBED 2 pgs not scrubbed in time
> pg 1.5c not scrubbed since 2023-07-10 21:15:50.352848
> pg 1.55 not scrubbed since 2023-06-24 10:02:10.038311
>
> ==
>
>
> We have tried the below commands to resolve it:
>
> 1. We ran the PG repair command "ceph pg repair 15.f4f".
> 2. We restarted the associated OSDs mapped to pg 15.f4f.
> 3. We tuned the osd_max_scrubs value and set it to 9.
> 4. We ran a scrub and a deep scrub via "ceph pg scrub 15.4f4" & "ceph pg
> deep-scrub 15.f4f".
> 5. We also tried the ceph-objectstore-tool command to fix it.
> ==
>
> We have checked the logs of the primary OSD of the respective inconsistent
> PG and found the below errors.
> [ERR] : 15.f4fs0 shard 402(2)
> 15:f2f3fff4:::94a51ddb-a94f-47bc-9068-509e8c09af9a.7862003.20_c%2f4%2fd61%2f885%2f49627697%2f192_1.ts:head
> : missing
> /var/log/ceph/ceph-osd.238.log:339:2023-10-06 00:37:06.410 7f65024cb700 -1
> log_channel(cluster) log [ERR] : 15.f4fs0 shard 266(3)
> 15:f2f2:::94a51ddb-a94f-47bc-9068-509e8c09af9a.11432468.3_TN8QHE_04.20.2020_08.41%2fCV_MAGNETIC%2fV_274396%2fCHUNK_2440801%2fSFILE_CONTAINER_031.FOLDER%2f3:head
> : missing
> /var/log/ceph/ceph-osd.238.log:340:2023-10-06 00:37:06.410 7f65024cb700 -1
> log_channel(cluster) log [ERR] : 15.f4fs0 shard 402(2)
> 15:f2f2:::94a51ddb-a94f-47bc-9068-509e8c09af9a.11432468.3_TN8QHE_04.20.2020_08.41%2fCV_MAGNETIC%2fV_274396%2fCHUNK_2440801%2fSFILE_CONTAINER_031.FOLDER%2f3:head
> : missing
> /var/log/ceph/ceph-osd.238.log:341:2023-10-06 00:37:06.410 7f65024cb700 -1
> log_channel(cluster) log [ERR] : 15.f4fs0 shard 590(6)
> 15:f2f2:::94a51ddb-a94f-47bc-9068-509e8c09af9a.11432468.3_TN8QHE_04.20.2020_08.41%2fCV_MAGNETIC%2fV_274396%2fCHUNK_2440801%2fSFILE_CONTAINER_031.FOLDER%2f3:head
> : missing
> ===
> We also noticed that the number of scrub errors in the ceph health status
> matches the ERR log entries in the primary OSD's log for the inconsistent
> PG, as below:
> grep -Hn 'ERR' /var/log/ceph/ceph-osd.238.log|wc -l
> 159226
> 
> Ceph is cleaning up the scrub errors, but the repair rate is very slow
> (an average of 200 scrub errors per day). We want to increase the repair
> rate in order to finish cleaning up the 159224 pending scrub errors.
>
> #ceph pg 15.f4f query
>
>
> {
> "state": "active+clean+inconsistent",
> "snap_trimq": "[]",
> "snap_trimq_len": 0,
> "epoch": 409009,
> "up": [
> 238,
> 106,
> 402,
> 266,
> 374,
> 498,
> 590,
> 627,
> 684,
> 73,
> 66
> ],
> "acting": [
> 238,
> 106,
> 402,
> 266,
> 374,
> 498,
> 590,
> 627,