We just had metadata damage show up on our Jewel cluster. I tried a few
things like renaming directories and scanning, but the damage would just
show up again within 24 hours. I finally copied the damaged directories
to a tmp location on CephFS, then swapped them with the damaged ones.
When I deleted the directories with the damage, the active MDS crashed,
but the standby-replay took over just fine. I haven't seen the messages
now for almost a week.
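
The swap itself was nothing fancy, roughly the following (the paths here
are just placeholders for my setup, adjust to yours):

# cp -a /mnt/cephfs/dir /mnt/cephfs/dir.new
# mv /mnt/cephfs/dir /mnt/cephfs/dir.damaged
# mv /mnt/cephfs/dir.new /mnt/cephfs/dir
# rm -rf /mnt/cephfs/dir.damaged

It was the last rm that crashed the active MDS.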
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Mon, Aug 19, 2019 at 10:30 PM Lars Täuber <taeu...@bbaw.de> wrote:

> Hi there!
>
> Does anyone else have an idea what I could do to get rid of this error?
>
> BTW: this is the third time that pg 20.0 has gone inconsistent.
> It is a pg from the metadata pool (cephfs).
> Might this be related somehow?
>
> # ceph health detail
> HEALTH_ERR 1 MDSs report damaged metadata; 1 scrub errors; Possible data
> damage: 1 pg inconsistent
> MDS_DAMAGE 1 MDSs report damaged metadata
>     mdsmds3(mds.0): Metadata damage detected
> OSD_SCRUB_ERRORS 1 scrub errors
> PG_DAMAGED Possible data damage: 1 pg inconsistent
>     pg 20.0 is active+clean+inconsistent, acting [9,27,15]
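>
> I guess I could try something like the following for the inconsistent
> pg, but I'm not sure whether a plain repair is safe here before
> knowing what the inconsistency actually is:
>
> # rados list-inconsistent-obj 20.0 --format=json-pretty
> # ceph pg repair 20.0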
>
>
> Best regards,
> Lars
>
>
> Mon, 19 Aug 2019 13:51:59 +0200
> Lars Täuber <taeu...@bbaw.de> ==> Paul Emmerich <paul.emmer...@croit.io> :
> > Hi Paul,
> >
> > thanks for the hint.
> >
> > I did a recursive scrub from "/". The log says there were some inodes
> with bad backtraces that got repaired. But the error remains.
> > Could this have something to do with a deleted file? Or a file within
> a snapshot?
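> >
> > In case the exact invocation matters, what I ran was something like
> the following (on newer releases; older ones use the scrub_path
> admin-socket command instead):
> >
> > # ceph tell mds.mds3 scrub start / recursive,repair
> >
> > I also wonder whether I still have to clear the old entry with
> "damage rm <id>" afterwards for the health error to go away.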
> >
> > The path reported by
> >
> > # ceph tell mds.mds3 damage ls
> > 2019-08-19 13:43:04.608 7f563f7f6700  0 client.894552 ms_handle_reset on
> v2:192.168.16.23:6800/176704036
> > 2019-08-19 13:43:04.624 7f56407f8700  0 client.894558 ms_handle_reset on
> v2:192.168.16.23:6800/176704036
> > [
> >     {
> >         "damage_type": "backtrace",
> >         "id": 3760765989,
> >         "ino": 1099518115802,
> >         "path": "~mds0/stray7/100005161f7/dovecot.index.backup"
> >     }
> > ]
> >
> > starts with a stray directory, which looks strange to me.
> >
> > Are the snapshots also repaired with a recursive repair operation?
> >
> > Thanks
> > Lars
> >
> >
> > Mon, 19 Aug 2019 13:30:53 +0200
> > Paul Emmerich <paul.emmer...@croit.io> ==> Lars Täuber <taeu...@bbaw.de>
> :
> > > Hi,
> > >
> > > that error just says that the path is wrong. Unfortunately I don't
> > > know off the top of my head the correct way to instruct it to scrub
> > > a stray path; you can always run a recursive scrub on / to go over
> > > everything, though.
> > >
> > >
> > > Paul
> > >
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
