So the cluster has been down since around 8/10/2016. I have since rebooted it in order to try the new ceph-monstore-tool rebuild functionality.
I built the recently backported Debian packages of the Hammer tools and installed them across all of the servers:

root@kh08-8:/home/lacadmin# ceph --version
ceph version 0.94.9-4530-g83af8cd (83af8cdaaa6d94404e6146b68e532a784e3cc99c)

From here I ran the following:
------------------------------------------------------------------------------
#!/bin/bash
set -e

store="/home/localadmin/monstore/"
rm -rf "${store}"
mkdir -p "${store}"

for host in kh{08..10}-{1..7}; do
    rsync -Pav ${store} ${host}:${store}
    for osd in $(ssh ${host} 'ls /var/lib/ceph/osd/ | grep ceph-'); do
        echo "${osd}"
        ssh ${host} "sudo ceph-objectstore-tool --data-path /var/lib/ceph/osd/${osd} --journal-path /var/lib/ceph/osd/${osd}/journal --op update-mon-db --mon-store-path ${store}"
    done
    ssh ${host} "sudo chown lacadmin. ${store}"
    rsync -Pav ${host}:${store} ${store}
done
------------------------------------------------------------------------------

This generated a 1.1G store.db directory.

From here I ran the following, per the GitHub guide (https://github.com/ceph/ceph/blob/master/doc/rados/troubleshooting/troubleshooting-mon.rst):

ceph-authtool ./admin.keyring -n mon. --cap mon 'allow *'
ceph-authtool -n client.admin --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow *'

which gave me the following keyring:
------------------------------------------------------------------------------
[mon.]
        key = AAAAAAAAAAAAAAAA
        caps mon = "allow *"
[client.admin]
        key = AAAAAAAAAAAAAAAA
        caps mds = "allow *"
        caps mon = "allow *"
        caps osd = "allow *"
------------------------------------------------------------------------------
The above looks like it shouldn't work, but I am going with it.
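For reference, the troubleshooting guide linked above creates the keyring with --create-keyring and --gen-key, which may be why my keyring came out with placeholder-looking keys: my first command omitted --gen-key and my second omitted the keyring path altogether. A sketch of the invocation as I read the guide (the /home/localadmin/admin.keyring path is my own choice, not from the guide):

```shell
# Per troubleshooting-mon.rst: create the keyring file and generate
# a real key for the mon. entity
ceph-authtool /home/localadmin/admin.keyring --create-keyring --gen-key \
    -n mon. --cap mon 'allow *'

# Add a generated client.admin key to the same keyring file
ceph-authtool /home/localadmin/admin.keyring --gen-key -n client.admin \
    --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow *'
```

With --gen-key both entities should get real base64 keys rather than the all-A placeholders shown above; whether that matters for the rebuild I can't say, but it seems worth ruling out.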
I tried using the monstore tool to rebuild from the monstore gathered from all 630 OSDs, but I am met with a dump T_T

------------------------------------------------------------------------------
ceph-monstore-tool /home/localadmin/monstore rebuild -- --keyring /home/localadmin/admin.keyring
*** Caught signal (Segmentation fault) **
 in thread 7f10cd6d88c0
 ceph version 0.94.9-4530-g83af8cd (83af8cdaaa6d94404e6146b68e532a784e3cc99c)
 1: ceph-monstore-tool() [0x5e960a]
 2: (()+0x10330) [0x7f10cc5c8330]
 3: (strlen()+0x2a) [0x7f10cac629da]
 4: (std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&)+0x25) [0x7f10cb576d75]
 5: (rebuild_monstore(char const*, std::vector<std::string, std::allocator<std::string> >&, MonitorDBStore&)+0x878) [0x544958]
 6: (main()+0x3e05) [0x52c035]
 7: (__libc_start_main()+0xf5) [0x7f10cabfbf45]
 8: ceph-monstore-tool() [0x540347]
2017-02-06 17:35:59.885651 7f10cd6d88c0 -1 *** Caught signal (Segmentation fault) **
 in thread 7f10cd6d88c0

 ceph version 0.94.9-4530-g83af8cd (83af8cdaaa6d94404e6146b68e532a784e3cc99c)
 1: ceph-monstore-tool() [0x5e960a]
 2: (()+0x10330) [0x7f10cc5c8330]
 3: (strlen()+0x2a) [0x7f10cac629da]
 4: (std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&)+0x25) [0x7f10cb576d75]
 5: (rebuild_monstore(char const*, std::vector<std::string, std::allocator<std::string> >&, MonitorDBStore&)+0x878) [0x544958]
 6: (main()+0x3e05) [0x52c035]
 7: (__libc_start_main()+0xf5) [0x7f10cabfbf45]
 8: ceph-monstore-tool() [0x540347]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- begin dump of recent events ---
   -15> 2017-02-06 17:35:54.362066 7f10cd6d88c0  5 asok(0x355a000) register_command perfcounters_dump hook 0x350a0d0
   -14> 2017-02-06 17:35:54.362122 7f10cd6d88c0  5 asok(0x355a000) register_command 1 hook 0x350a0d0
   -13> 2017-02-06 17:35:54.362137 7f10cd6d88c0  5 asok(0x355a000) register_command perf dump hook 0x350a0d0
   -12> 2017-02-06 17:35:54.362147 7f10cd6d88c0  5 asok(0x355a000) register_command perfcounters_schema hook 0x350a0d0
   -11> 2017-02-06 17:35:54.362157 7f10cd6d88c0  5 asok(0x355a000) register_command 2 hook 0x350a0d0
   -10> 2017-02-06 17:35:54.362161 7f10cd6d88c0  5 asok(0x355a000) register_command perf schema hook 0x350a0d0
    -9> 2017-02-06 17:35:54.362170 7f10cd6d88c0  5 asok(0x355a000) register_command perf reset hook 0x350a0d0
    -8> 2017-02-06 17:35:54.362179 7f10cd6d88c0  5 asok(0x355a000) register_command config show hook 0x350a0d0
    -7> 2017-02-06 17:35:54.362188 7f10cd6d88c0  5 asok(0x355a000) register_command config set hook 0x350a0d0
    -6> 2017-02-06 17:35:54.362193 7f10cd6d88c0  5 asok(0x355a000) register_command config get hook 0x350a0d0
    -5> 2017-02-06 17:35:54.362202 7f10cd6d88c0  5 asok(0x355a000) register_command config diff hook 0x350a0d0
    -4> 2017-02-06 17:35:54.362207 7f10cd6d88c0  5 asok(0x355a000) register_command log flush hook 0x350a0d0
    -3> 2017-02-06 17:35:54.362215 7f10cd6d88c0  5 asok(0x355a000) register_command log dump hook 0x350a0d0
    -2> 2017-02-06 17:35:54.362220 7f10cd6d88c0  5 asok(0x355a000) register_command log reopen hook 0x350a0d0
    -1> 2017-02-06 17:35:54.379684 7f10cd6d88c0  2 auth: KeyRing::load: loaded key file /home/lacadmin/admin.keyring
     0> 2017-02-06 17:35:59.885651 7f10cd6d88c0 -1 *** Caught signal (Segmentation fault) **
 in thread 7f10cd6d88c0

 ceph version 0.94.9-4530-g83af8cd (83af8cdaaa6d94404e6146b68e532a784e3cc99c)
 1: ceph-monstore-tool() [0x5e960a]
 2: (()+0x10330) [0x7f10cc5c8330]
 3: (strlen()+0x2a) [0x7f10cac629da]
 4: (std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&)+0x25) [0x7f10cb576d75]
 5: (rebuild_monstore(char const*, std::vector<std::string, std::allocator<std::string> >&, MonitorDBStore&)+0x878) [0x544958]
 6: (main()+0x3e05) [0x52c035]
 7: (__libc_start_main()+0xf5) [0x7f10cabfbf45]
 8: ceph-monstore-tool() [0x540347]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 rbd_replay
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 client
   0/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 keyvaluestore
   1/ 3 journal
   1/ 1 ms
  10/10 mon
   0/10 monc
   1/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/10 civetweb
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
   0/ 0 refs
   1/ 5 xio
  -2/-2 (syslog threshold)
  99/99 (stderr threshold)
  max_recent       500
  max_new         1000
  log_file
--- end dump of recent events ---
Segmentation fault (core dumped)
------------------------------------------------------------------------------

I have tried copying my monitor and admin keyrings into the admin.keyring used for the rebuild, and it still fails. I am not sure whether this is due to my packages or if something else is wrong. Is there a way to test or see what may be happening?

On Sat, Aug 13, 2016 at 10:36 PM, Sean Sullivan <seapasu...@uchicago.edu> wrote:

> So with a patched leveldb to skip errors I now have a store.db that I can
> extract the pg, mon, and osd map from.
That said when I try to start kh10-8 > it bombs out:: > > --------------------------------------- > --------------------------------------- > root@kh10-8:/var/lib/ceph/mon/ceph-kh10-8# ceph-mon -i $(hostname) -d > 2016-08-13 22:30:54.596039 7fa8b9e088c0 0 ceph version 0.94.7 ( > d56bdf93ced6b80b07397d57e3fa68fe68304432), process ceph-mon, pid 708653 > starting mon.kh10-8 rank 2 at 10.64.64.125:6789/0 mon_data > /var/lib/ceph/mon/ceph-kh10-8 fsid e452874b-cb29-4468-ac7f-f8901dfccebf > 2016-08-13 22:30:54.608150 7fa8b9e088c0 0 starting mon.kh10-8 rank 2 at > 10.64.64.125:6789/0 mon_data /var/lib/ceph/mon/ceph-kh10-8 fsid > e452874b-cb29-4468-ac7f-f8901dfccebf > 2016-08-13 22:30:54.608395 7fa8b9e088c0 1 mon.kh10-8@-1(probing) e1 > preinit fsid e452874b-cb29-4468-ac7f-f8901dfccebf > 2016-08-13 22:30:54.608617 7fa8b9e088c0 1 > mon.kh10-8@-1(probing).paxosservice(pgmap > 0..35606392) refresh upgraded, format 0 -> 1 > 2016-08-13 22:30:54.608629 7fa8b9e088c0 1 mon.kh10-8@-1(probing).pg v0 > on_upgrade discarding in-core PGMap > terminate called after throwing an instance of > 'ceph::buffer::end_of_buffer' > what(): buffer::end_of_buffer > *** Caught signal (Aborted) ** > in thread 7fa8b9e088c0 > ceph version 0.94.7 (d56bdf93ced6b80b07397d57e3fa68fe68304432) > 1: ceph-mon() [0x9b25ea] > 2: (()+0x10330) [0x7fa8b8f0b330] > 3: (gsignal()+0x37) [0x7fa8b73a8c37] > 4: (abort()+0x148) [0x7fa8b73ac028] > 5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7fa8b7cb3535] > 6: (()+0x5e6d6) [0x7fa8b7cb16d6] > 7: (()+0x5e703) [0x7fa8b7cb1703] > 8: (()+0x5e922) [0x7fa8b7cb1922] > 9: ceph-mon() [0x853c39] > 10: (object_stat_collection_t::decode(ceph::buffer::list::iterator&)+0x167) > [0x894227] > 11: (pg_stat_t::decode(ceph::buffer::list::iterator&)+0x5ff) [0x894baf] > 12: (PGMap::update_pg(pg_t, ceph::buffer::list&)+0xa3) [0x91a8d3] > 13: (PGMonitor::read_pgmap_full()+0x1d8) [0x68b9b8] > 14: (PGMonitor::update_from_paxos(bool*)+0xbf7) [0x6977b7] > 15: 
(PaxosService::refresh(bool*)+0x19a) [0x605b5a] > 16: (Monitor::refresh_from_paxos(bool*)+0x1db) [0x5b1ffb] > 17: (Monitor::init_paxos()+0x85) [0x5b2365] > 18: (Monitor::preinit()+0x7d7) [0x5b6f87] > 19: (main()+0x230c) [0x57853c] > 20: (__libc_start_main()+0xf5) [0x7fa8b7393f45] > 21: ceph-mon() [0x59a3c7] > 2016-08-13 22:30:54.611791 7fa8b9e088c0 -1 *** Caught signal (Aborted) ** > in thread 7fa8b9e088c0 > > ceph version 0.94.7 (d56bdf93ced6b80b07397d57e3fa68fe68304432) > 1: ceph-mon() [0x9b25ea] > 2: (()+0x10330) [0x7fa8b8f0b330] > 3: (gsignal()+0x37) [0x7fa8b73a8c37] > 4: (abort()+0x148) [0x7fa8b73ac028] > 5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7fa8b7cb3535] > 6: (()+0x5e6d6) [0x7fa8b7cb16d6] > 7: (()+0x5e703) [0x7fa8b7cb1703] > 8: (()+0x5e922) [0x7fa8b7cb1922] > 9: ceph-mon() [0x853c39] > 10: (object_stat_collection_t::decode(ceph::buffer::list::iterator&)+0x167) > [0x894227] > 11: (pg_stat_t::decode(ceph::buffer::list::iterator&)+0x5ff) [0x894baf] > 12: (PGMap::update_pg(pg_t, ceph::buffer::list&)+0xa3) [0x91a8d3] > 13: (PGMonitor::read_pgmap_full()+0x1d8) [0x68b9b8] > 14: (PGMonitor::update_from_paxos(bool*)+0xbf7) [0x6977b7] > 15: (PaxosService::refresh(bool*)+0x19a) [0x605b5a] > 16: (Monitor::refresh_from_paxos(bool*)+0x1db) [0x5b1ffb] > 17: (Monitor::init_paxos()+0x85) [0x5b2365] > 18: (Monitor::preinit()+0x7d7) [0x5b6f87] > 19: (main()+0x230c) [0x57853c] > 20: (__libc_start_main()+0xf5) [0x7fa8b7393f45] > 21: ceph-mon() [0x59a3c7] > NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed > to interpret this. 
> > --- begin dump of recent events --- > -33> 2016-08-13 22:30:54.593450 7fa8b9e088c0 5 asok(0x36a20f0) > register_command perfcounters_dump hook 0x365a050 > -32> 2016-08-13 22:30:54.593480 7fa8b9e088c0 5 asok(0x36a20f0) > register_command 1 hook 0x365a050 > -31> 2016-08-13 22:30:54.593486 7fa8b9e088c0 5 asok(0x36a20f0) > register_command perf dump hook 0x365a050 > -30> 2016-08-13 22:30:54.593496 7fa8b9e088c0 5 asok(0x36a20f0) > register_command perfcounters_schema hook 0x365a050 > -29> 2016-08-13 22:30:54.593499 7fa8b9e088c0 5 asok(0x36a20f0) > register_command 2 hook 0x365a050 > -28> 2016-08-13 22:30:54.593501 7fa8b9e088c0 5 asok(0x36a20f0) > register_command perf schema hook 0x365a050 > -27> 2016-08-13 22:30:54.593503 7fa8b9e088c0 5 asok(0x36a20f0) > register_command perf reset hook 0x365a050 > -26> 2016-08-13 22:30:54.593505 7fa8b9e088c0 5 asok(0x36a20f0) > register_command config show hook 0x365a050 > -25> 2016-08-13 22:30:54.593508 7fa8b9e088c0 5 asok(0x36a20f0) > register_command config set hook 0x365a050 > -24> 2016-08-13 22:30:54.593510 7fa8b9e088c0 5 asok(0x36a20f0) > register_command config get hook 0x365a050 > -23> 2016-08-13 22:30:54.593512 7fa8b9e088c0 5 asok(0x36a20f0) > register_command config diff hook 0x365a050 > -22> 2016-08-13 22:30:54.593513 7fa8b9e088c0 5 asok(0x36a20f0) > register_command log flush hook 0x365a050 > -21> 2016-08-13 22:30:54.593557 7fa8b9e088c0 5 asok(0x36a20f0) > register_command log dump hook 0x365a050 > -20> 2016-08-13 22:30:54.593561 7fa8b9e088c0 5 asok(0x36a20f0) > register_command log reopen hook 0x365a050 > -19> 2016-08-13 22:30:54.596039 7fa8b9e088c0 0 ceph version 0.94.7 ( > d56bdf93ced6b80b07397d57e3fa68fe68304432), process ceph-mon, pid 708653 > -18> 2016-08-13 22:30:54.597587 7fa8b9e088c0 5 asok(0x36a20f0) init > /var/run/ceph/ceph-mon.kh10-8.asok > -17> 2016-08-13 22:30:54.597601 7fa8b9e088c0 5 asok(0x36a20f0) > bind_and_listen /var/run/ceph/ceph-mon.kh10-8.asok > -16> 2016-08-13 22:30:54.597767 7fa8b9e088c0 5 
asok(0x36a20f0) > register_command 0 hook 0x36560c0 > -15> 2016-08-13 22:30:54.597775 7fa8b9e088c0 5 asok(0x36a20f0) > register_command version hook 0x36560c0 > -14> 2016-08-13 22:30:54.597778 7fa8b9e088c0 5 asok(0x36a20f0) > register_command git_version hook 0x36560c0 > -13> 2016-08-13 22:30:54.597781 7fa8b9e088c0 5 asok(0x36a20f0) > register_command help hook 0x365a150 > -12> 2016-08-13 22:30:54.597783 7fa8b9e088c0 5 asok(0x36a20f0) > register_command get_command_descriptions hook 0x365a140 > -11> 2016-08-13 22:30:54.597860 7fa8b5181700 5 asok(0x36a20f0) entry > start > -10> 2016-08-13 22:30:54.608150 7fa8b9e088c0 0 starting mon.kh10-8 > rank 2 at 10.64.64.125:6789/0 mon_data /var/lib/ceph/mon/ceph-kh10-8 fsid > e452874b-cb29-4468-ac7f-f8901dfccebf > -9> 2016-08-13 22:30:54.608210 7fa8b9e088c0 1 -- 10.64.64.125:6789/0 > learned my addr 10.64.64.125:6789/0 > -8> 2016-08-13 22:30:54.608214 7fa8b9e088c0 1 accepter.accepter.bind > my_inst.addr is 10.64.64.125:6789/0 need_addr=0 > -7> 2016-08-13 22:30:54.608279 7fa8b9e088c0 5 adding auth protocol: > cephx > -6> 2016-08-13 22:30:54.608282 7fa8b9e088c0 5 adding auth protocol: > cephx > -5> 2016-08-13 22:30:54.608311 7fa8b9e088c0 10 log_channel(cluster) > update_config to_monitors: true to_syslog: false syslog_facility: daemon > prio: info) > -4> 2016-08-13 22:30:54.608317 7fa8b9e088c0 10 log_channel(audit) > update_config to_monitors: true to_syslog: false syslog_facility: local0 > prio: info) > -3> 2016-08-13 22:30:54.608395 7fa8b9e088c0 1 mon.kh10-8@-1(probing) > e1 preinit fsid e452874b-cb29-4468-ac7f-f8901dfccebf > -2> 2016-08-13 22:30:54.608617 7fa8b9e088c0 1 > mon.kh10-8@-1(probing).paxosservice(pgmap > 0..35606392) refresh upgraded, format 0 -> 1 > -1> 2016-08-13 22:30:54.608629 7fa8b9e088c0 1 mon.kh10-8@-1(probing).pg > v0 on_upgrade discarding in-core PGMap > 0> 2016-08-13 22:30:54.611791 7fa8b9e088c0 -1 *** Caught signal > (Aborted) ** > in thread 7fa8b9e088c0 > > ceph version 0.94.7 
(d56bdf93ced6b80b07397d57e3fa68fe68304432) > 1: ceph-mon() [0x9b25ea] > 2: (()+0x10330) [0x7fa8b8f0b330] > 3: (gsignal()+0x37) [0x7fa8b73a8c37] > 4: (abort()+0x148) [0x7fa8b73ac028] > 5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7fa8b7cb3535] > 6: (()+0x5e6d6) [0x7fa8b7cb16d6] > 7: (()+0x5e703) [0x7fa8b7cb1703] > 8: (()+0x5e922) [0x7fa8b7cb1922] > 9: ceph-mon() [0x853c39] > 10: (object_stat_collection_t::decode(ceph::buffer::list::iterator&)+0x167) > [0x894227] > 11: (pg_stat_t::decode(ceph::buffer::list::iterator&)+0x5ff) [0x894baf] > 12: (PGMap::update_pg(pg_t, ceph::buffer::list&)+0xa3) [0x91a8d3] > 13: (PGMonitor::read_pgmap_full()+0x1d8) [0x68b9b8] > 14: (PGMonitor::update_from_paxos(bool*)+0xbf7) [0x6977b7] > 15: (PaxosService::refresh(bool*)+0x19a) [0x605b5a] > 16: (Monitor::refresh_from_paxos(bool*)+0x1db) [0x5b1ffb] > 17: (Monitor::init_paxos()+0x85) [0x5b2365] > 18: (Monitor::preinit()+0x7d7) [0x5b6f87] > 19: (main()+0x230c) [0x57853c] > 20: (__libc_start_main()+0xf5) [0x7fa8b7393f45] > 21: ceph-mon() [0x59a3c7] > NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed > to interpret this. 
> > --- logging levels --- > 0/ 5 none > 0/ 1 lockdep > 0/ 1 context > 1/ 1 crush > 1/ 5 mds > 1/ 5 mds_balancer > 1/ 5 mds_locker > 1/ 5 mds_log > 1/ 5 mds_log_expire > 1/ 5 mds_migrator > 0/ 1 buffer > 0/ 1 timer > 0/ 1 filer > 0/ 1 striper > 0/ 1 objecter > 0/ 5 rados > 0/ 5 rbd > 0/ 5 rbd_replay > 0/ 5 journaler > 0/ 5 objectcacher > 0/ 5 client > 0/ 5 osd > 0/ 5 optracker > 0/ 5 objclass > 1/ 3 filestore > 1/ 3 keyvaluestore > 1/ 3 journal > 0/ 5 ms > 1/ 5 mon > 0/10 monc > 1/ 5 paxos > 0/ 5 tp > 1/ 5 auth > 1/ 5 crypto > 1/ 1 finisher > 1/ 5 heartbeatmap > 1/ 5 perfcounter > 1/ 5 rgw > 1/10 civetweb > 1/ 5 javaclient > 1/ 5 asok > 1/ 1 throttle > 0/ 0 refs > 1/ 5 xio > -2/-2 (syslog threshold) > 99/99 (stderr threshold) > max_recent 10000 > max_new 1000 > log_file > --- end dump of recent events --- > Aborted (core dumped) > --------------------------------------- > --------------------------------------- > > I feel like I am so close but so far. Can anyone give me a nudge as to > what I can do next? it looks like it is bombing out on trying to get an > updated paxos. > > > > On Fri, Aug 12, 2016 at 1:09 PM, Sean Sullivan <seapasu...@uchicago.edu> > wrote: > >> A coworker patched leveldb and we were able to export quite a bit of data >> from kh08's leveldb database. At this point I think I need to re-construct >> a new leveldb with whatever values I can. Is it the same leveldb database >> across all 3 montiors? IE will keys exported from one work in the other? >> All should have the same keys/values although constructed differently >> right? I can't blindly copy /var/lib/ceph/mon/ceph-$(hostname)/store.db/ >> from one host to another right? But can I copy the keys/values from one to >> another? >> >> On Fri, Aug 12, 2016 at 12:45 PM, Sean Sullivan <seapasu...@uchicago.edu> >> wrote: >> >>> ceph-monstore-tool? Is that the same as monmaptool? oops! 
NM found it in >>> ceph-test package:: >>> >>> I can't seem to get it working :-( dump monmap or any of the commands. >>> They all bomb out with the same message: >>> >>> root@kh10-8:/var/lib/ceph/mon/ceph-kh10-8/store.db# ceph-monstore-tool >>> /var/lib/ceph/mon/ceph-kh10-8 dump-trace -- /tmp/test.trace >>> Corruption: 1 missing files; e.g.: /var/lib/ceph/mon/ceph-kh10-8/ >>> store.db/10882319.ldb >>> root@kh10-8:/var/lib/ceph/mon/ceph-kh10-8/store.db# ceph-monstore-tool >>> /var/lib/ceph/mon/ceph-kh10-8 dump-keys >>> Corruption: 1 missing files; e.g.: /var/lib/ceph/mon/ceph-kh10-8/ >>> store.db/10882319.ldb >>> >>> >>> I need to clarify as I originally had 2 clusters with this issue and now >>> I have 1 with all 3 monitors dead and 1 that I was successfully able to >>> repair. I am about to recap everything I know about the issue and the issue >>> at hand. Should I start a new email thread about this instead? >>> >>> The cluster that is currently having issues is on hammer (94.7), and the >>> monitor stats are the same:: >>> root@kh08-8:~# cat /proc/cpuinfo | grep -iE "model name" | uniq -c >>> 24 model name : Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz >>> ext4 volume comprised of 4x300GB 10k drives in raid 10. 
>>> ubuntu 14.04 >>> >>> root@kh08-8:~# uname -a >>> Linux kh08-8 3.13.0-76-generic #120-Ubuntu SMP Mon Jan 18 15:59:10 UTC >>> 2016 x86_64 x86_64 x86_64 GNU/Linux >>> root@kh08-8:~# ceph --version >>> ceph version 0.94.7 (d56bdf93ced6b80b07397d57e3fa68fe68304432) >>> >>> >>> From here: Here are the errors I am getting when starting each of the >>> monitors:: >>> >>> >>> --------------- >>> root@kh08-8:~# /usr/bin/ceph-mon --cluster=ceph -i kh08-8 -d >>> 2016-08-11 22:15:23.731550 7fe5ad3e98c0 0 ceph version 0.94.7 >>> (d56bdf93ced6b80b07397d57e3fa68fe68304432), process ceph-mon, pid 317309 >>> Corruption: error in middle of record >>> 2016-08-11 22:15:28.274340 7fe5ad3e98c0 -1 error opening mon data >>> directory at '/var/lib/ceph/mon/ceph-kh08-8': (22) Invalid argument >>> -- >>> root@kh09-8:~# /usr/bin/ceph-mon --cluster=ceph -i kh09-8 -d >>> 2016-08-11 22:14:28.252370 7f7eaab908c0 0 ceph version 0.94.7 >>> (d56bdf93ced6b80b07397d57e3fa68fe68304432), process ceph-mon, pid 308888 >>> Corruption: 14 missing files; e.g.: /var/lib/ceph/mon/ceph-kh09-8/ >>> store.db/10845998.ldb >>> 2016-08-11 22:14:35.094237 7f7eaab908c0 -1 error opening mon data >>> directory at '/var/lib/ceph/mon/ceph-kh09-8': (22) Invalid argument >>> -- >>> root@kh10-8:/var/lib/ceph/mon/ceph-kh10-8/store.db# /usr/bin/ceph-mon >>> --cluster=ceph -i kh10-8 -d >>> 2016-08-11 22:17:54.632762 7f80bf34d8c0 0 ceph version 0.94.7 >>> (d56bdf93ced6b80b07397d57e3fa68fe68304432), process ceph-mon, pid 292620 >>> Corruption: 1 missing files; e.g.: /var/lib/ceph/mon/ceph-kh10-8/ >>> store.db/10882319.ldb >>> 2016-08-11 22:18:01.207749 7f80bf34d8c0 -1 error opening mon data >>> directory at '/var/lib/ceph/mon/ceph-kh10-8': (22) Invalid argument >>> --------------- >>> >>> >>> for kh08, a coworker patched leveldb to print and skip on the first >>> error and that one is also missing a bunch of files. 
>>> As such I think kh10-8 is my most likely candidate to recover, but
>>> either way recovery is probably not an option. I see leveldb has a
>>> repair.cc (https://github.com/google/leveldb/blob/master/db/repair.cc)
>>> but I do not see repair mentioned in the monitor with respect to the
>>> dbstore. I tried using the leveldb python module (plyvel) to attempt a
>>> repair but my repl just ends up dying.
>>>
>>> I understand two things::
>>> 1.) Without rebuilding the monitor backend leveldb store (the cluster
>>> map as I understand it), all of the data in the cluster is essentially
>>> lost (right?)
>>> 2.) It is possible to rebuild this database via some form of magic or
>>> (source)ry, as all of this data is essentially held throughout the
>>> cluster as well.
>>>
>>> We only use radosgw / S3 for this cluster. If there is a way to recover
>>> my data that is easier/more likely than rebuilding the leveldb of a
>>> monitor and starting up a single-monitor cluster, I would like to
>>> switch gears and focus on that.
>>>
>>> Looking at the dev docs:
>>> http://docs.ceph.com/docs/hammer/architecture/#cluster-map
>>> it has 5 main parts::
>>>
>>> ```
>>> The Monitor Map: Contains the cluster fsid; the position, name,
>>> address, and port of each monitor. It also indicates the current epoch,
>>> when the map was created, and the last time it changed. To view a
>>> monitor map, execute ceph mon dump.
>>> The OSD Map: Contains the cluster fsid, when the map was created and
>>> last modified, a list of pools, replica sizes, PG numbers, a list of
>>> OSDs and their status (e.g., up, in). To view an OSD map, execute
>>> ceph osd dump.
>>> The PG Map: Contains the PG version, its time stamp, the last OSD map
>>> epoch, the full ratios, and details on each placement group such as the
>>> PG ID, the Up Set, the Acting Set, the state of the PG (e.g.,
>>> active + clean), and data usage statistics for each pool.
>>> The CRUSH Map: Contains a list of storage devices, the failure domain
>>> hierarchy (e.g., device, host, rack, row, room, etc.), and rules for
>>> traversing the hierarchy when storing data. To view a CRUSH map,
>>> execute ceph osd getcrushmap -o (unknown); then, decompile it by
>>> executing crushtool -d {comp-crushmap-filename} -o
>>> {decomp-crushmap-filename}. You can view the decompiled map in a text
>>> editor or with cat.
>>> The MDS Map: Contains the current MDS map epoch, when the map was
>>> created, and the last time it changed. It also contains the pool for
>>> storing metadata, a list of metadata servers, and which metadata
>>> servers are up and in. To view an MDS map, execute ceph mds dump.
>>> ```
>>>
>>> As we don't use cephfs, the mds map can essentially be blank (right?),
>>> so I am left with 4 valid maps needed to get a working cluster again. I
>>> don't see auth mentioned in there, but I need that too. Then I just
>>> need to rebuild the leveldb database somehow with the right information
>>> and I should be good. So a long, long journey ahead.
>>>
>>> I don't think that the data is stored in strings or json, right? Am I
>>> going down the wrong path here? Is there a shorter/simpler path to
>>> retrieve the data from a cluster that lost all 3 monitors in a power
>>> failure? If I am going down the right path, is there any advice on how
>>> I can assemble/repair the database?
>>>
>>> I see that there is an rbd recovery tool for a dead cluster. Is it
>>> possible to do the same with s3 objects?
>>>
>>> On Thu, Aug 11, 2016 at 11:15 AM, Wido den Hollander <w...@42on.com>
>>> wrote:
>>>
>>>>
>>>> > Op 11 augustus 2016 om 15:17 schreef Sean Sullivan <
>>>> seapasu...@uchicago.edu>:
>>>> >
>>>> >
>>>> > Hello Wido,
>>>> >
>>>> > Thanks for the advice. While the data center has a/b circuits and
>>>> > redundant power, etc., if a ground fault happens it travels outside
>>>> > and fails, causing the whole building to fail (apparently).
>>>> > >>>> > The monitors are each the same with >>>> > 2x e5 cpus >>>> > 64gb of ram >>>> > 4x 300gb 10k SAS drives in raid 10 (write through mode). >>>> > Ubuntu 14.04 with the latest updates prior to power failure >>>> (2016/Aug/10 - >>>> > 3am CST) >>>> > Ceph hammer LTS 0.94.7 >>>> > >>>> > (we are still working on our jewel test cluster so it is planned but >>>> not in >>>> > place yet) >>>> > >>>> > The only thing that seems to be corrupt is the monitors leveldb >>>> store. I >>>> > see multiple issues on Google leveldb github from March 2016 about >>>> fsync >>>> > and power failure so I assume this is an issue with leveldb. >>>> > >>>> > I have backed up /var/lib/ceph/Mon on all of my monitors before >>>> trying to >>>> > proceed with any form of recovery. >>>> > >>>> > Is there any way to reconstruct the leveldb or replace the monitors >>>> and >>>> > recover the data? >>>> > >>>> I don't know. I have never done it. Other people might know this better >>>> than me. >>>> >>>> Maybe 'ceph-monstore-tool' can help you? >>>> >>>> Wido >>>> >>>> > I found the following post in which sage says it is tedious but >>>> possible. ( >>>> > http://www.spinics.net/lists/ceph-devel/msg06662.html). Tedious is >>>> fine if >>>> > I have any chance of doing it. I have the fsid, the Mon key map and >>>> all of >>>> > the osds look to be fine so all of the previous osd maps are there. >>>> > >>>> > I just don't understand what key/values I need inside. >>>> > >>>> > On Aug 11, 2016 1:33 AM, "Wido den Hollander" <w...@42on.com> wrote: >>>> > >>>> > > >>>> > > > Op 11 augustus 2016 om 0:10 schreef Sean Sullivan < >>>> > > seapasu...@uchicago.edu>: >>>> > > > >>>> > > > >>>> > > > I think it just got worse:: >>>> > > > >>>> > > > all three monitors on my other cluster say that ceph-mon can't >>>> open >>>> > > > /var/lib/ceph/mon/$(hostname). Is there any way to recover if you >>>> lose >>>> > > all >>>> > > > 3 monitors? 
I saw a post by Sage saying that the data can be >>>> recovered as >>>> > > > all of the data is held on other servers. Is this possible? If so >>>> has >>>> > > > anyone had any experience doing so? >>>> > > >>>> > > I have never done so, so I couldn't tell you. >>>> > > >>>> > > However, it is weird that on all three it got corrupted. What >>>> hardware are >>>> > > you using? Was it properly protected against power failure? >>>> > > >>>> > > If you mon store is corrupted I'm not sure what might happen. >>>> > > >>>> > > However, make a backup of ALL monitors right now before doing >>>> anything. >>>> > > >>>> > > Wido >>>> > > >>>> > > > _______________________________________________ >>>> > > > ceph-users mailing list >>>> > > > ceph-users@lists.ceph.com >>>> > > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >>>> > > >>>> >>> >>> >>> >>> -- >>> - Sean: I wrote this. - >>> >> >> >> >> -- >> - Sean: I wrote this. - >> > > > > -- > - Sean: I wrote this. - > -- - Sean: I wrote this. -
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com