http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-mon/#recovery-using-osds
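
As a rough sketch, the procedure on that page comes down to the steps below. The script only prints the commands (a dry run) so they can be reviewed first. The scratch directory, the keyring path, the mon id "foo" and the OSD paths are all placeholders, not values from this thread. It assumes each OSD is stopped while ceph-objectstore-tool reads it. With several OSD hosts, the same scratch directory has to be copied from host to host so that every OSD adds its maps to it before the rebuild step.

```shell
#!/bin/sh
# Dry-run sketch: prints the recovery commands instead of running them.
# Every path and the mon id "foo" are placeholders (assumptions).
ms=${MS:-/tmp/mon-store}                        # scratch dir for the rebuilt store
keyring=${KEYRING:-/path/to/admin.keyring}      # keyring holding mon. and client.admin
mon_dir=${MON_DIR:-/var/lib/ceph/mon/ceph-foo}  # data dir of the mon to repair

plan_mon_rebuild() {
    echo "mkdir -p $ms"
    # 1. Pull the cluster maps out of each OSD given as an argument.
    #    The OSD daemon must be stopped while the tool has its store open.
    for osd in "$@"; do
        echo "ceph-objectstore-tool --data-path $osd --op update-mon-db --mon-store-path $ms"
    done
    # 2. Make sure the keyring carries the caps the rebuilt mon store needs.
    echo "ceph-authtool $keyring -n mon. --cap mon 'allow *'"
    echo "ceph-authtool $keyring -n client.admin --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow *'"
    # 3. Rebuild a mon store from the collected maps.
    echo "ceph-monstore-tool $ms rebuild -- --keyring $keyring"
    # 4. Keep the old store aside and move the rebuilt one into place.
    echo "mv $mon_dir/store.db $mon_dir/store.db.corrupted"
    echo "mv $ms/store.db $mon_dir/store.db"
    echo "chown -R ceph:ceph $mon_dir/store.db"
}

plan_mon_rebuild /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-1
```

The linked page also lists what this kind of rebuild does not bring back, so it is worth reading its limitations section before starting the mon on the rebuilt store.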
On Thu, May 31, 2018 at 1:49 PM Leônidas Villeneuve wrote:
> I had a small Ceph cluster and had to take down one node. The data from
> its OSDs was reallocated to the other OSDs, and that went fine.
>
> After the reallocation, I removed its mon.service as described by the
> official documentation.
>
> Then, everything went wrong. The other mons just collapsed and stopped
> talking to mgrs. The mgr dashboard still works but has outdated data. The
> osds are still up and rbd volumes are working too, but the mons can't get
> online.
>
> After trying everything in the troubleshooting guide, including removing
> the old mon from the monmap, I couldn't inject the new monmap because of
> lock errors in store.db. When I finally injected the new monmap, the mon
> refused to start. I tried the same on the other mons and got the same
> results. And, to my despair, the store.db ended up corrupted.
>
> I finally gave up and, after backing up the store.db, deleted the mons
> and started fresh ones. That worked, but the new mons now have no OSDs
> or hosts mapped to them. I have an old CRUSH map and that's all.
>
> But, since the OSDs are still up, is it possible to rebuild the map and
> all the data the mons need to start working again from them? That's the
> last resort I have.
>
> Put another way, I have the OSD services and the OSD data but no monitor
> and no mgr, and I need to get them running again. Any tips would be
> appreciated.
>
> Thanks.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>