Hi Brad,

I got the following:

[root@mgmt01 ~]# ceph health detail
HEALTH_ERR 1 pgs inconsistent; 1 scrub errors
pg 1.65 is active+clean+inconsistent, acting [62,67,47]
1 scrub errors
[root@mgmt01 ~]# rados list-inconsistent-obj 1.65
No scrub information available for pg 1.65
error 2: (2) No such file or directory
[root@mgmt01 ~]# rados list-inconsistent-snapset 1.65
No scrub information available for pg 1.65
error 2: (2) No such file or directory

Rather odd output, I’d say; not that I understand what that means.
I also tried rados list-inconsistent-pg:

[root@mgmt01 ~]# rados lspools
rbd
cephfs_data
cephfs_metadata
.rgw.root
default.rgw.control
default.rgw.data.root
default.rgw.gc
default.rgw.log
ctrl-p
prod
corp
camp
dev
default.rgw.users.uid
default.rgw.users.keys
default.rgw.buckets.index
default.rgw.buckets.data
default.rgw.buckets.non-ec
[root@mgmt01 ~]# for i in $(rados lspools); do rados list-inconsistent-pg $i; done
[]
["1.65"]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]

The only non-empty result is the second one, which corresponds to the
second pool listed, so that’d put the inconsistency in the cephfs_data
pool.

Thank you for your help,

-kc

K.C. Wong
kcw...@verseon.com
M: +1 (408) 769-8235

-----------------------------------------------------
Confidentiality Notice:
This message contains confidential information. If you are not the
intended recipient and received this message in error, any use or
distribution is strictly prohibited. Please also notify us
immediately by return e-mail, and delete this message from your
computer system. Thank you.
-----------------------------------------------------
4096R/B8995EDE <https://sks-keyservers.net/pks/lookup?op=get&search=0x23A692E9B8995EDE>
E527 CBE8 023E 79EA 8BBB 5C77 23A6 92E9 B899 5EDE
hkps://hkps.pool.sks-keyservers.net

> On Nov 11, 2018, at 5:43 PM, Brad Hubbard <bhubb...@redhat.com> wrote:
>
> What does "rados list-inconsistent-obj <pg>" say?
>
> Note that you may have to do a deep scrub to populate the output.
>
> On Mon, Nov 12, 2018 at 5:10 AM K.C. Wong <kcw...@verseon.com> wrote:
>>
>> Hi folks,
>>
>> I would appreciate any pointer as to how I can resolve a
>> PG stuck in “active+clean+inconsistent” state. This has
>> resulted in HEALTH_ERR status for the last 5 days with no
>> end in sight. The state got triggered when one of the drives
>> in the PG returned an I/O error. I’ve since replaced the failed
>> drive.
>>
>> I’m running Jewel (out of centos-release-ceph-jewel) on
>> CentOS 7. I’ve tried “ceph pg repair <pg>” and it didn’t seem
>> to do anything. I’ve tried even more drastic measures such as
>> comparing all the files (using filestore) under that PG_head
>> on all 3 copies and then nuking the outlier. Nothing worked.
>>
>> Many thanks,
>>
>> -kc
>
> --
> Cheers,
> Brad
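
The two checks discussed in this thread (finding which pool holds the
inconsistent PG, and deep-scrubbing so that list-inconsistent-obj has
something to report) could be sketched roughly as below. This is only a
sketch under stated assumptions: the function names are hypothetical and
not part of Ceph, the retry count and delay are arbitrary defaults, and it
assumes "rados list-inconsistent-obj" exits non-zero while no scrub
information is available, as the "error 2" output above suggests.

```shell
# Hypothetical helpers; only the ceph/rados subcommands themselves are real.

# Print "<pool>: <pg list>" for every pool that reports inconsistent PGs,
# so the result does not have to be matched to a pool by position.
inconsistent_pgs_by_pool() {
    local pool pgs
    rados lspools | while IFS= read -r pool; do
        [ -n "$pool" ] || continue
        pgs=$(rados list-inconsistent-pg "$pool") || continue
        # A pool without inconsistent PGs prints an empty list.
        [ "$pgs" = "[]" ] && continue
        printf '%s: %s\n' "$pool" "$pgs"
    done
}

# Ask for a deep scrub of one PG, then poll until scrub results can be listed.
# Usage: deep_scrub_and_list <pgid> [attempts] [seconds-between-attempts]
deep_scrub_and_list() {
    local pgid=$1 attempts=${2:-60} delay=${3:-30} i=0
    ceph pg deep-scrub "$pgid" || return 1
    while [ "$i" -lt "$attempts" ]; do
        # Assumption: a non-zero exit means "No scrub information available".
        if rados list-inconsistent-obj "$pgid" --format=json-pretty 2>/dev/null; then
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "no scrub information for pg $pgid after $attempts attempts" >&2
    return 1
}
```

On a node with an admin keyring, one might run "inconsistent_pgs_by_pool"
first and then "deep_scrub_and_list 1.65". The deep scrub is only queued by
the first command, so the polling interval may need to be much longer on a
busy cluster.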
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com