Thanks for the info, Paul. Our cluster is currently 130 GB in size, and we are just starting out with Ceph adoption at our company.

For now I am looking for guidance from the community; it will also help us learn more about the product and the support that is available.
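For context, this is roughly the sequence we tried to follow from the disaster-recovery doc linked further down. It is only a sketch from my notes, not a transcript of our shell history; the data pool name is a placeholder, and the mimic build of cephfs-journal-tool may additionally want an explicit --rank=cephfs:0 argument, so please check it against the doc:

    # all MDS daemons stopped first
    cephfs-journal-tool journal export backup.bin         # back up the journal before touching anything
    cephfs-journal-tool event recover_dentries summary    # recover what dentries we can from the journal
    cephfs-journal-tool journal reset                      # truncate the damaged journal
    cephfs-table-tool all reset session                    # drop stale client sessions
    cephfs-table-tool all reset inode                      # reset the inode table
    cephfs-data-scan init                                  # prepare the metadata pool for the rebuild
    cephfs-data-scan scan_extents <data pool>              # backward scan, pass 1
    cephfs-data-scan scan_inodes <data pool>               # backward scan, pass 2
    cephfs-data-scan scan_links                            # repair dentry/inode linkage

If any of these steps look wrong or incomplete for our case, that is very likely where things went sideways.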
Thanks,

On Fri, 10 Aug 2018 at 9:52 PM, Paul Emmerich <paul.emmer...@croit.io> wrote:

> Sorry, a step-by-step guide through something like that
> is beyond the scope of what we can do on a mailing list.
>
> But what I would do here is carefully assess the situation
> and the damage. My wild guess would be to reset and rebuild
> the inode table, but that might be incorrect and unsafe
> without looking into it further.
>
> I don't want to solicit our services here, but we do Ceph
> recoveries regularly; reach out to us if you are looking
> for a consultant.
>
> Paul
>
> 2018-08-10 18:05 GMT+02:00 Amit Handa <amit.ha...@gmail.com>:
>
>> Thanks a lot, Paul.
>> We did (hopefully) follow through with the disaster recovery.
>> However, please guide me on how to get the cluster back up!
>>
>> Thanks,
>>
>> On Fri, Aug 10, 2018 at 9:32 PM Paul Emmerich <paul.emmer...@croit.io> wrote:
>>
>>> Looks like you got some duplicate inodes due to corrupted metadata. You
>>> likely tried a disaster recovery and didn't follow it through completely,
>>> or you hit some bug in Ceph.
>>>
>>> The solution here is probably a full recovery of the metadata, i.e. a full
>>> backwards scan after resetting the inodes. I recovered a cluster from
>>> something similar just a few weeks ago. Annoying, but recoverable.
>>>
>>> Paul
>>>
>>> 2018-08-10 13:26 GMT+02:00 Amit Handa <amit.ha...@gmail.com>:
>>>
>>>> We are facing constant crashes from the Ceph MDS. We have installed mimic
>>>> (v13.2.1).
>>>>
>>>> mds: cephfs-1/1/1 up {0=node2=up:active(laggy or crashed)}
>>>>
>>>> MDS logs: https://pastebin.com/AWGMLRm0
>>>>
>>>> We have followed the DR steps listed at
>>>> http://docs.ceph.com/docs/mimic/cephfs/disaster-recovery/
>>>>
>>>> Please help in resolving the errors :(
>>>>
>>>> MDS crash stack trace:
>>>>
>>>> ceph version 13.2.1 (5533ecdc0fda920179d7ad84e0aa65a127b20d77) mimic (stable)
>>>> 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0xff) [0x7f984fc3ee1f]
>>>> 2: (()+0x284fe7) [0x7f984fc3efe7]
>>>> 3: (()+0x2087fe) [0x5563e88537fe]
>>>> 4: (Server::prepare_new_inode(boost::intrusive_ptr<MDRequestImpl>&, CDir*, inodeno_t, unsigned int, file_layout_t*)+0xf37) [0x5563e87ce777]
>>>> 5: (Server::handle_client_openc(boost::intrusive_ptr<MDRequestImpl>&)+0xdb0) [0x5563e87d0bd0]
>>>> 6: (Server::handle_client_request(MClientRequest*)+0x49e) [0x5563e87d3c0e]
>>>> 7: (Server::dispatch(Message*)+0x2db) [0x5563e87d789b]
>>>> 8: (MDSRank::handle_deferrable_message(Message*)+0x434) [0x5563e87514b4]
>>>> 9: (MDSRank::_dispatch(Message*, bool)+0x63b) [0x5563e875db5b]
>>>> 10: (MDSRank::retry_dispatch(Message*)+0x12) [0x5563e875e302]
>>>> 11: (MDSInternalContextBase::complete(int)+0x67) [0x5563e89afb57]
>>>> 12: (MDSRank::_advance_queues()+0xd1) [0x5563e875cd51]
>>>> 13: (MDSRank::ProgressThread::entry()+0x43) [0x5563e875d3e3]
>>>> 14: (()+0x7e25) [0x7f984d869e25]
>>>> 15: (clone()+0x6d) [0x7f984c949bad]
>>>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90
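P.S. Once the metadata rebuild is done, my current understanding of how we would bring the filesystem back online is roughly the following; please correct me if this is wrong for mimic. The fs name "cephfs" and the MDS host "node2" are taken from our status output above; everything else is my assumption from the docs:

    ceph mds repaired cephfs:0           # clear the damaged flag on rank 0, if it is set
    ceph fs set cephfs joinable true     # allow an MDS to take the rank again, if the fs was marked not joinable
    systemctl restart ceph-mds@node2     # restart the MDS daemon (unit instance assumed to match the MDS name)
    ceph -s                              # watch for the MDS to go active
    ceph fs status cephfs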
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com