`ceph versions` reports:
{
    "mon": {
        "ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable)": 3
    },
    "mgr": {
        "ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable)": 3
    },
    "osd": {
        "ceph version 12.2.10 (177915764b752804194937482a39e95e0ca3de94) luminous (stable)": 197,
        "ceph version 12.2.9 (9e300932ef8a8916fb3fda78c58691a6ab0f4217) luminous (stable)": 11
    },
    "mds": {
        "ceph version 12.2.10 (177915764b752804194937482a39e95e0ca3de94) luminous (stable)": 2,
        "ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable)": 1
    },
    "overall": {
        "ceph version 12.2.10 (177915764b752804194937482a39e95e0ca3de94) luminous (stable)": 199,
        "ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable)": 7,
        "ceph version 12.2.9 (9e300932ef8a8916fb3fda78c58691a6ab0f4217) luminous (stable)": 11
    }
}

I didn't realize we were in such a weird state with versions; we'll update all those to 12.2.10 today :)

________________________________
From: Yan, Zheng <uker...@gmail.com>
Sent: Tuesday, April 2, 2019 20:26
To: Sergey Malinin
Cc: Pickett, Neale T; ceph-users
Subject: Re: [ceph-users] MDS allocates all memory (>500G) replaying, OOM-killed, repeat

Looks like http://tracker.ceph.com/issues/37399. Which version of ceph-mds do you use?

On Tue, Apr 2, 2019 at 7:47 AM Sergey Malinin <ad...@data-center.com> wrote:
>
> These steps pretty well correspond to
> http://docs.ceph.com/docs/mimic/cephfs/disaster-recovery/
> Were you able to replay journal manually with no issues? IIRC,
> "cephfs-journal-tool recover_dentries" would lead to OOM in case of MDS doing
> so, and it has already been discussed on this list.
>
>
> April 2, 2019 1:37 AM, "Pickett, Neale T" <ne...@lanl.gov> wrote:
>
> Here is what I wound up doing to fix this:
>
> Bring down all MDSes so they stop flapping
> Back up journal (as seen in previous message)
> Apply journal manually
> Reset journal manually
> Clear session table
> Clear other tables (not sure I needed to do this)
> Mark FS down
> Mark the rank 0 MDS as failed
> Reset the FS (yes, I really mean it)
> Restart MDSes
> Finally get some sleep
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
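
For anyone landing here later, this is a rough sketch of how the steps quoted above might map onto commands. The thread itself never lists the exact invocations, so everything here is an assumption: the commands are recalled from the disaster-recovery page Sergey linked, `cluster_down` is what I believe the Luminous-era flag for marking the FS down is called, "other tables" is guessed to mean `snap` and `inode`, and `recovery_plan`, `render`, the filesystem name `cephfs`, and `backup.bin` are all hypothetical. It only prints a plan and runs nothing. Stopping and restarting the MDS daemons is left out because that depends on how they are deployed.

```python
"""Dry-run sketch of the quoted recovery sequence; it prints commands only."""
import shlex


def recovery_plan(fs_name, backup_path="backup.bin"):
    """Return (step description, argv) pairs in the order given in the thread.

    Assumption: command spellings follow the CephFS disaster-recovery docs as
    recalled; verify each one against the docs for your release before use.
    """
    return [
        ("Back up journal",
         ["cephfs-journal-tool", "journal", "export", backup_path]),
        ("Apply journal manually",
         ["cephfs-journal-tool", "event", "recover_dentries", "summary"]),
        ("Reset journal manually",
         ["cephfs-journal-tool", "journal", "reset"]),
        ("Clear session table",
         ["cephfs-table-tool", "all", "reset", "session"]),
        # Assumption: "other tables" means the snap and inode tables.
        ("Clear other tables (snap)",
         ["cephfs-table-tool", "all", "reset", "snap"]),
        ("Clear other tables (inode)",
         ["cephfs-table-tool", "all", "reset", "inode"]),
        # Assumption: Luminous-era flag name; later releases may differ.
        ("Mark FS down",
         ["ceph", "fs", "set", fs_name, "cluster_down", "true"]),
        ("Mark the rank 0 MDS as failed",
         ["ceph", "mds", "fail", "0"]),
        ("Reset the FS",
         ["ceph", "fs", "reset", fs_name, "--yes-i-really-mean-it"]),
    ]


def render(plan):
    """Format each step as a comment line followed by a shell-quoted command."""
    return [
        "# {}\n{}".format(desc, " ".join(shlex.quote(arg) for arg in argv))
        for desc, argv in plan
    ]


if __name__ == "__main__":
    # "cephfs" is a placeholder filesystem name.
    print("\n\n".join(render(recovery_plan("cephfs"))))
```

Keeping the plan as data rather than executing it directly makes it easy to review each destructive step (journal reset, table reset, FS reset) before anything is run by hand.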
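
Going back to the `ceph versions` output at the top of this message: a mixed-version state like that is easy to miss by eye, so here is a minimal sketch of how it could be checked automatically. The helper names (`short`, `stragglers`) and the idea of comparing against a single target release are my own illustration, not something from the thread; the sample data is copied from the output above.

```python
"""Sketch: count daemons per type that are not on a target Ceph release."""

V8 = "ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable)"
V9 = "ceph version 12.2.9 (9e300932ef8a8916fb3fda78c58691a6ab0f4217) luminous (stable)"
V10 = "ceph version 12.2.10 (177915764b752804194937482a39e95e0ca3de94) luminous (stable)"

# Per-daemon sections of the report quoted above ("overall" omitted).
REPORT = {
    "mon": {V8: 3},
    "mgr": {V8: 3},
    "osd": {V10: 197, V9: 11},
    "mds": {V10: 2, V8: 1},
}


def short(version_string):
    """Extract the bare release, e.g. '12.2.10', from the full version string."""
    return version_string.split()[2]


def stragglers(report, target):
    """Return {daemon_type: count} for daemons not running the target release."""
    behind = {}
    for daemon, versions in report.items():
        if daemon == "overall":
            continue  # aggregate section; would double-count
        count = sum(n for v, n in versions.items() if short(v) != target)
        if count:
            behind[daemon] = count
    return behind


if __name__ == "__main__":
    print(stragglers(REPORT, "12.2.10"))
    # prints {'mon': 3, 'mgr': 3, 'osd': 11, 'mds': 1}
```

For a live cluster the same function could be fed `json.load(sys.stdin)` with `ceph versions` piped in, assuming the output is JSON shaped like the report above.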