Hi,
I have a 3x-replicated pool with Ceph 12.2.7.
One HDD broke, its OSD "2" was automatically marked "out", the disk was
physically replaced with a new one, and the replacement OSD was added back
into the cluster.
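(If it matters: the replacement was done roughly along these lines, as far as
I recall; the device path below is just a placeholder:)

# systemctl stop ceph-osd@2
# ceph osd purge 2 --yes-i-really-mean-it
# ceph-volume lvm create --bluestore --data /dev/sdX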
Now `ceph health detail` permanently shows:
[ERR] OSD_SCRUB_ERRORS: 1 scrub errors
[ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent
pg 2.87 is active+clean+inconsistent, acting [33,2,20]
What exactly is wrong here?
Why can Ceph not fix the issue?
With BlueStore I have checksums on the two unbroken disks, so what remaining
inconsistency can there be?
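If more detail is needed to answer that, I assume I could grep the log of the
primary OSD (osd.33, the first entry in the acting set) for the deep-scrub
error, assuming the default log location, e.g.:

# grep -i scrub /var/log/ceph/ceph-osd.33.log | grep 2.87

but I am not sure that is the right place to look.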
The suggested command in
https://docs.ceph.com/en/pacific/rados/operations/pg-repair/#commands-for-diagnosing-pg-problems
does not work:
# rados list-inconsistent-obj 2.87
No scrub information available for pg 2.87
error 2: (2) No such file or directory
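My guess is that the scrub information has expired or was lost in the
meantime, and that I would first have to trigger a fresh deep scrub before the
command returns anything, roughly like this:

# ceph pg deep-scrub 2.87
(wait for the deep scrub of 2.87 to finish, e.g. by watching `ceph -w`)
# rados list-inconsistent-obj 2.87 --format=json-pretty

but I am not sure whether that is the intended workflow.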
Further, I find the documentation in
https://docs.ceph.com/en/pacific/rados/operations/pg-repair/#more-information-on-pg-repair
extremely unclear.
It says:
"In the case of replicated pools, recovery is beyond the scope of pg repair."
while many people on the Internet suggest that `ceph pg repair` might fix the
issue.
Still others claim that Ceph will fix the issue by itself.
I am hesitant to run "ceph pg repair" without understanding what the problem is
and what exactly this will do.
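If `ceph pg repair` is indeed the right tool here, I assume the invocation
would simply be:

# ceph pg repair 2.87

but I would first like to know whether it can pick the wrong replica as the
authoritative copy, or whether the BlueStore checksums rule that out.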
I have already reported the "error 2" problem and the unclear documentation in
issue https://tracker.ceph.com/issues/61739 but have not received a reply yet,
and my cluster remains "inconsistent".
How can this be fixed?
I would appreciate any help!