Reliable hardware alone isn't enough to protect against monitor failures. I've had a case where a mistyped command brought down all three monitors with a segfault, and there was no way to bring them back because the command had corrupted the monitor database. I wish the monitor database had a checkpoint mechanism so that changes could be reverted. I'm not even sure a regular backup of the monitor database, say every 5 minutes, would have helped, as restoring it could still leave the OSDs and monitors out of sync. I also tried restoring the monitor database via ceph-objectstore-tool, but just ended up with out-of-sync OSDs and monitors: the monitors thought the OSDs were offline while the OSDs were actually up, not to mention the PGs were all out of whack as well.
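For anyone following along, the ceph-objectstore-tool route boils down to roughly the steps sketched below. This is only a sketch under assumptions: the helper name plan_mon_store_rebuild and the example paths are made up for illustration, the flags follow the "recovery using OSDs" procedure as I understand it and may differ between releases, and nothing is executed, the commands are only printed for review.

```python
import shlex


def plan_mon_store_rebuild(osd_paths, mon_store, keyring):
    """Return the commands (as argv lists) for rebuilding a mon store from OSDs.

    Hypothetical helper: it only builds the command lines, it runs nothing.
    """
    plan = []
    for osd in osd_paths:
        # Each (stopped) OSD contributes the cluster maps it holds.
        plan.append([
            "ceph-objectstore-tool",
            "--data-path", str(osd),
            "--op", "update-mon-db",
            "--mon-store-path", str(mon_store),
        ])
    # Rebuild the store, importing keys from a keyring that survived.
    plan.append([
        "ceph-monstore-tool", str(mon_store), "rebuild",
        "--", "--keyring", str(keyring),
    ])
    return plan


# Example paths are assumptions, adjust them to the actual hosts.
example_plan = plan_mon_store_rebuild(
    ["/var/lib/ceph/osd/ceph-0", "/var/lib/ceph/osd/ceph-1"],
    "/tmp/mon-store",
    "/etc/ceph/ceph.client.admin.keyring",
)
for argv in example_plan:
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Even when these steps succeed, the rebuilt store only reflects what the OSDs knew, which is consistent with the out-of-sync state described above.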
https://tracker.ceph.com/issues/22847

--
Efficiency is Intelligent Laziness

From: ceph-users <[email protected]> on behalf of Caspar Smit <[email protected]>
Date: Tuesday, May 22, 2018 at 7:05 AM
To: ceph-users <[email protected]>
Subject: Re: [ceph-users] Data recovery after loosing all monitors

2018-05-22 15:51 GMT+02:00 Wido den Hollander <[email protected]>:

On 05/22/2018 03:38 PM, George Shuklin wrote:
> Good news, it's not an emergency, just a curiosity.
>
> Suppose I lost all monitors in a ceph cluster in my laboratory. I have
> all OSDs intact. Is it possible to recover something from Ceph?

Yes, there is. Using ceph-objectstore-tool you are able to rebuild the MON database.

BUT, this isn't something you would really want to do as you lose your cephx keys and such and getting them all back will be a total nightmare.

My advice, make sure you have reliable hardware for your Monitors. Run them on DC-grade SSDs and you'll be fine.

And be sure to have enough space available on them to sustain a long period of PGs not being active+clean.

Kind regards,
Caspar

Wido

_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
