So the cluster has been down since around 8/10/2016. I have since rebooted it in order to try the new ceph-monstore-tool rebuild functionality.
I built the recently backported Debian packages of the Hammer tools and installed them across all of the servers:

root@kh08-8:/home/lacadmin# ceph --version
ceph version 0.94.9-4530-g83af8cd (83af8cdaaa6d94404e6146b68e532a784e3cc99c)

From here I ran the following:
------------------------------------------------------------------------------
#!/bin/bash
set -e

store="/home/localadmin/monstore/"
rm -rf "${store}"
mkdir -p "${store}"

for host in kh{08..10}-{1..7}; do
    rsync -Pav ${store} ${host}:${store}
    for osd in $(ssh ${host} 'ls /var/lib/ceph/osd/ | grep ceph-'); do
        echo "${osd}"
        ssh ${host} "sudo ceph-objectstore-tool --data-path /var/lib/ceph/osd/${osd} --journal-path /var/lib/ceph/osd/${osd}/journal --op update-mon-db --mon-store-path ${store}"
    done
    ssh ${host} "sudo chown lacadmin. ${store}"
    rsync -Pav ${host}:${store} ${store}
done
------------------------------------------------------------------------------

This generated a 1.1G store.db directory.

From here I ran the following, per the GitHub guide (https://github.com/ceph/ceph/blob/master/doc/rados/troubleshooting/troubleshooting-mon.rst):

ceph-authtool ./admin.keyring -n mon. --cap mon 'allow *'
ceph-authtool -n client.admin --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow *'

which gave me the following keyring:
------------------------------------------------------------------------------
[mon.]
        key = AAAAAAAAAAAAAAAA
        caps mon = "allow *"
[client.admin]
        key = AAAAAAAAAAAAAAAA
        caps mds = "allow *"
        caps mon = "allow *"
        caps osd = "allow *"
------------------------------------------------------------------------------
The above looks like it shouldn't work, but I am going with it.
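For reference, the troubleshooting guide linked above creates the keyring with --create-keyring and --gen-key, which may be why my keyring came out with placeholder-looking keys: my first command omitted --gen-key and my second omitted the keyring path altogether. A sketch of the invocation as I read the guide (the /home/localadmin/admin.keyring path is my own choice, not from the guide):

```shell
# Per troubleshooting-mon.rst: create the keyring file and generate
# a real key for the mon. entity
ceph-authtool /home/localadmin/admin.keyring --create-keyring --gen-key \
    -n mon. --cap mon 'allow *'

# Add a generated client.admin key to the same keyring file
ceph-authtool /home/localadmin/admin.keyring --gen-key -n client.admin \
    --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow *'
```

With --gen-key both entities should get real base64 keys rather than the all-A placeholders shown above; whether that matters for the rebuild I can't say, but it seems worth ruling out.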
I tried using the monstore tool to rebuild from the monstore gathered from all 630 OSDs, but I am met with a dump T_T

------------------------------------------------------------------------------
ceph-monstore-tool /home/localadmin/monstore rebuild -- --keyring /home/localadmin/admin.keyring
*** Caught signal (Segmentation fault) **
 in thread 7f10cd6d88c0
 ceph version 0.94.9-4530-g83af8cd (83af8cdaaa6d94404e6146b68e532a784e3cc99c)
 1: ceph-monstore-tool() [0x5e960a]
 2: (()+0x10330) [0x7f10cc5c8330]
 3: (strlen()+0x2a) [0x7f10cac629da]
 4: (std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&)+0x25) [0x7f10cb576d75]
 5: (rebuild_monstore(char const*, std::vector<std::string, std::allocator<std::string> >&, MonitorDBStore&)+0x878) [0x544958]
 6: (main()+0x3e05) [0x52c035]
 7: (__libc_start_main()+0xf5) [0x7f10cabfbf45]
 8: ceph-monstore-tool() [0x540347]
2017-02-06 17:35:59.885651 7f10cd6d88c0 -1 *** Caught signal (Segmentation fault) **
 in thread 7f10cd6d88c0

 ceph version 0.94.9-4530-g83af8cd (83af8cdaaa6d94404e6146b68e532a784e3cc99c)
 1: ceph-monstore-tool() [0x5e960a]
 2: (()+0x10330) [0x7f10cc5c8330]
 3: (strlen()+0x2a) [0x7f10cac629da]
 4: (std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&)+0x25) [0x7f10cb576d75]
 5: (rebuild_monstore(char const*, std::vector<std::string, std::allocator<std::string> >&, MonitorDBStore&)+0x878) [0x544958]
 6: (main()+0x3e05) [0x52c035]
 7: (__libc_start_main()+0xf5) [0x7f10cabfbf45]
 8: ceph-monstore-tool() [0x540347]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- begin dump of recent events ---
   -15> 2017-02-06 17:35:54.362066 7f10cd6d88c0  5 asok(0x355a000) register_command perfcounters_dump hook 0x350a0d0
   -14> 2017-02-06 17:35:54.362122 7f10cd6d88c0  5 asok(0x355a000) register_command 1 hook 0x350a0d0
   -13> 2017-02-06 17:35:54.362137 7f10cd6d88c0  5 asok(0x355a000) register_command perf dump hook 0x350a0d0
   -12> 2017-02-06 17:35:54.362147 7f10cd6d88c0  5 asok(0x355a000) register_command perfcounters_schema hook 0x350a0d0
   -11> 2017-02-06 17:35:54.362157 7f10cd6d88c0  5 asok(0x355a000) register_command 2 hook 0x350a0d0
   -10> 2017-02-06 17:35:54.362161 7f10cd6d88c0  5 asok(0x355a000) register_command perf schema hook 0x350a0d0
    -9> 2017-02-06 17:35:54.362170 7f10cd6d88c0  5 asok(0x355a000) register_command perf reset hook 0x350a0d0
    -8> 2017-02-06 17:35:54.362179 7f10cd6d88c0  5 asok(0x355a000) register_command config show hook 0x350a0d0
    -7> 2017-02-06 17:35:54.362188 7f10cd6d88c0  5 asok(0x355a000) register_command config set hook 0x350a0d0
    -6> 2017-02-06 17:35:54.362193 7f10cd6d88c0  5 asok(0x355a000) register_command config get hook 0x350a0d0
    -5> 2017-02-06 17:35:54.362202 7f10cd6d88c0  5 asok(0x355a000) register_command config diff hook 0x350a0d0
    -4> 2017-02-06 17:35:54.362207 7f10cd6d88c0  5 asok(0x355a000) register_command log flush hook 0x350a0d0
    -3> 2017-02-06 17:35:54.362215 7f10cd6d88c0  5 asok(0x355a000) register_command log dump hook 0x350a0d0
    -2> 2017-02-06 17:35:54.362220 7f10cd6d88c0  5 asok(0x355a000) register_command log reopen hook 0x350a0d0
    -1> 2017-02-06 17:35:54.379684 7f10cd6d88c0  2 auth: KeyRing::load: loaded key file /home/lacadmin/admin.keyring
     0> 2017-02-06 17:35:59.885651 7f10cd6d88c0 -1 *** Caught signal (Segmentation fault) **
 in thread 7f10cd6d88c0

 ceph version 0.94.9-4530-g83af8cd (83af8cdaaa6d94404e6146b68e532a784e3cc99c)
 1: ceph-monstore-tool() [0x5e960a]
 2: (()+0x10330) [0x7f10cc5c8330]
 3: (strlen()+0x2a) [0x7f10cac629da]
 4: (std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&)+0x25) [0x7f10cb576d75]
 5: (rebuild_monstore(char const*, std::vector<std::string, std::allocator<std::string> >&, MonitorDBStore&)+0x878) [0x544958]
 6: (main()+0x3e05) [0x52c035]
 7: (__libc_start_main()+0xf5) [0x7f10cabfbf45]
 8: ceph-monstore-tool() [0x540347]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 rbd_replay
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 client
   0/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 keyvaluestore
   1/ 3 journal
   1/ 1 ms
  10/10 mon
   0/10 monc
   1/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/10 civetweb
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
   0/ 0 refs
   1/ 5 xio
  -2/-2 (syslog threshold)
  99/99 (stderr threshold)
  max_recent       500
  max_new         1000
  log_file
--- end dump of recent events ---
Segmentation fault (core dumped)
------------------------------------------------------------------------------

I have tried copying my monitor and admin keyrings into the admin.keyring used for the rebuild, and it still fails. I am not sure whether this is due to my packages or if something else is wrong. Is there a way to test or see what may be happening?

On Sat, Aug 13, 2016 at 10:36 PM, Sean Sullivan <seapasu...@uchicago.edu> wrote:

> So with a patched leveldb to skip errors I now have a store.db that I can
> extract the pg, mon, and osd map from.
That said when I try to start kh10-8 > it bombs out:: > > --------------------------------------- > --------------------------------------- > root@kh10-8:/var/lib/ceph/mon/ceph-kh10-8# ceph-mon -i $(hostname) -d > 2016-08-13 22:30:54.596039 7fa8b9e088c0 0 ceph version 0.94.7 ( > d56bdf93ced6b80b07397d57e3fa68fe68304432), process ceph-mon, pid 708653 > starting mon.kh10-8 rank 2 at 10.64.64.125:6789/0 mon_data > /var/lib/ceph/mon/ceph-kh10-8 fsid e452874b-cb29-4468-ac7f-f8901dfccebf > 2016-08-13 22:30:54.608150 7fa8b9e088c0 0 starting mon.kh10-8 rank 2 at > 10.64.64.125:6789/0 mon_data /var/lib/ceph/mon/ceph-kh10-8 fsid > e452874b-cb29-4468-ac7f-f8901dfccebf > 2016-08-13 22:30:54.608395 7fa8b9e088c0 1 mon.kh10-8@-1(probing) e1 > preinit fsid e452874b-cb29-4468-ac7f-f8901dfccebf > 2016-08-13 22:30:54.608617 7fa8b9e088c0 1 > mon.kh10-8@-1(probing).paxosservice(pgmap > 0..35606392) refresh upgraded, format 0 -> 1 > 2016-08-13 22:30:54.608629 7fa8b9e088c0 1 mon.kh10-8@-1(probing).pg v0 > on_upgrade discarding in-core PGMap > terminate called after throwing an instance of > 'ceph::buffer::end_of_buffer' > what(): buffer::end_of_buffer > *** Caught signal (Aborted) ** > in thread 7fa8b9e088c0 > ceph version 0.94.7 (d56bdf93ced6b80b07397d57e3fa68fe68304432) > 1: ceph-mon() [0x9b25ea] > 2: (()+0x10330) [0x7fa8b8f0b330] > 3: (gsignal()+0x37) [0x7fa8b73a8c37] > 4: (abort()+0x148) [0x7fa8b73ac028] > 5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7fa8b7cb3535] > 6: (()+0x5e6d6) [0x7fa8b7cb16d6] > 7: (()+0x5e703) [0x7fa8b7cb1703] > 8: (()+0x5e922) [0x7fa8b7cb1922] > 9: ceph-mon() [0x853c39] > 10: (object_stat_collection_t::decode(ceph::buffer::list::iterator&)+0x167) > [0x894227] > 11: (pg_stat_t::decode(ceph::buffer::list::iterator&)+0x5ff) [0x894baf] > 12: (PGMap::update_pg(pg_t, ceph::buffer::list&)+0xa3) [0x91a8d3] > 13: (PGMonitor::read_pgmap_full()+0x1d8) [0x68b9b8] > 14: (PGMonitor::update_from_paxos(bool*)+0xbf7) [0x6977b7] > 15: 
(PaxosService::refresh(bool*)+0x19a) [0x605b5a] > 16: (Monitor::refresh_from_paxos(bool*)+0x1db) [0x5b1ffb] > 17: (Monitor::init_paxos()+0x85) [0x5b2365] > 18: (Monitor::preinit()+0x7d7) [0x5b6f87] > 19: (main()+0x230c) [0x57853c] > 20: (__libc_start_main()+0xf5) [0x7fa8b7393f45] > 21: ceph-mon() [0x59a3c7] > 2016-08-13 22:30:54.611791 7fa8b9e088c0 -1 *** Caught signal (Aborted) ** > in thread 7fa8b9e088c0 > > ceph version 0.94.7 (d56bdf93ced6b80b07397d57e3fa68fe68304432) > 1: ceph-mon() [0x9b25ea] > 2: (()+0x10330) [0x7fa8b8f0b330] > 3: (gsignal()+0x37) [0x7fa8b73a8c37] > 4: (abort()+0x148) [0x7fa8b73ac028] > 5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7fa8b7cb3535] > 6: (()+0x5e6d6) [0x7fa8b7cb16d6] > 7: (()+0x5e703) [0x7fa8b7cb1703] > 8: (()+0x5e922) [0x7fa8b7cb1922] > 9: ceph-mon() [0x853c39] > 10: (object_stat_collection_t::decode(ceph::buffer::list::iterator&)+0x167) > [0x894227] > 11: (pg_stat_t::decode(ceph::buffer::list::iterator&)+0x5ff) [0x894baf] > 12: (PGMap::update_pg(pg_t, ceph::buffer::list&)+0xa3) [0x91a8d3] > 13: (PGMonitor::read_pgmap_full()+0x1d8) [0x68b9b8] > 14: (PGMonitor::update_from_paxos(bool*)+0xbf7) [0x6977b7] > 15: (PaxosService::refresh(bool*)+0x19a) [0x605b5a] > 16: (Monitor::refresh_from_paxos(bool*)+0x1db) [0x5b1ffb] > 17: (Monitor::init_paxos()+0x85) [0x5b2365] > 18: (Monitor::preinit()+0x7d7) [0x5b6f87] > 19: (main()+0x230c) [0x57853c] > 20: (__libc_start_main()+0xf5) [0x7fa8b7393f45] > 21: ceph-mon() [0x59a3c7] > NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed > to interpret this. 
> > --- begin dump of recent events --- > -33> 2016-08-13 22:30:54.593450 7fa8b9e088c0 5 asok(0x36a20f0) > register_command perfcounters_dump hook 0x365a050 > -32> 2016-08-13 22:30:54.593480 7fa8b9e088c0 5 asok(0x36a20f0) > register_command 1 hook 0x365a050 > -31> 2016-08-13 22:30:54.593486 7fa8b9e088c0 5 asok(0x36a20f0) > register_command perf dump hook 0x365a050 > -30> 2016-08-13 22:30:54.593496 7fa8b9e088c0 5 asok(0x36a20f0) > register_command perfcounters_schema hook 0x365a050 > -29> 2016-08-13 22:30:54.593499 7fa8b9e088c0 5 asok(0x36a20f0) > register_command 2 hook 0x365a050 > -28> 2016-08-13 22:30:54.593501 7fa8b9e088c0 5 asok(0x36a20f0) > register_command perf schema hook 0x365a050 > -27> 2016-08-13 22:30:54.593503 7fa8b9e088c0 5 asok(0x36a20f0) > register_command perf reset hook 0x365a050 > -26> 2016-08-13 22:30:54.593505 7fa8b9e088c0 5 asok(0x36a20f0) > register_command config show hook 0x365a050 > -25> 2016-08-13 22:30:54.593508 7fa8b9e088c0 5 asok(0x36a20f0) > register_command config set hook 0x365a050 > -24> 2016-08-13 22:30:54.593510 7fa8b9e088c0 5 asok(0x36a20f0) > register_command config get hook 0x365a050 > -23> 2016-08-13 22:30:54.593512 7fa8b9e088c0 5 asok(0x36a20f0) > register_command config diff hook 0x365a050 > -22> 2016-08-13 22:30:54.593513 7fa8b9e088c0 5 asok(0x36a20f0) > register_command log flush hook 0x365a050 > -21> 2016-08-13 22:30:54.593557 7fa8b9e088c0 5 asok(0x36a20f0) > register_command log dump hook 0x365a050 > -20> 2016-08-13 22:30:54.593561 7fa8b9e088c0 5 asok(0x36a20f0) > register_command log reopen hook 0x365a050 > -19> 2016-08-13 22:30:54.596039 7fa8b9e088c0 0 ceph version 0.94.7 ( > d56bdf93ced6b80b07397d57e3fa68fe68304432), process ceph-mon, pid 708653 > -18> 2016-08-13 22:30:54.597587 7fa8b9e088c0 5 asok(0x36a20f0) init > /var/run/ceph/ceph-mon.kh10-8.asok > -17> 2016-08-13 22:30:54.597601 7fa8b9e088c0 5 asok(0x36a20f0) > bind_and_listen /var/run/ceph/ceph-mon.kh10-8.asok > -16> 2016-08-13 22:30:54.597767 7fa8b9e088c0 5 
asok(0x36a20f0) > register_command 0 hook 0x36560c0 > -15> 2016-08-13 22:30:54.597775 7fa8b9e088c0 5 asok(0x36a20f0) > register_command version hook 0x36560c0 > -14> 2016-08-13 22:30:54.597778 7fa8b9e088c0 5 asok(0x36a20f0) > register_command git_version hook 0x36560c0 > -13> 2016-08-13 22:30:54.597781 7fa8b9e088c0 5 asok(0x36a20f0) > register_command help hook 0x365a150 > -12> 2016-08-13 22:30:54.597783 7fa8b9e088c0 5 asok(0x36a20f0) > register_command get_command_descriptions hook 0x365a140 > -11> 2016-08-13 22:30:54.597860 7fa8b5181700 5 asok(0x36a20f0) entry > start > -10> 2016-08-13 22:30:54.608150 7fa8b9e088c0 0 starting mon.kh10-8 > rank 2 at 10.64.64.125:6789/0 mon_data /var/lib/ceph/mon/ceph-kh10-8 fsid > e452874b-cb29-4468-ac7f-f8901dfccebf > -9> 2016-08-13 22:30:54.608210 7fa8b9e088c0 1 -- 10.64.64.125:6789/0 > learned my addr 10.64.64.125:6789/0 > -8> 2016-08-13 22:30:54.608214 7fa8b9e088c0 1 accepter.accepter.bind > my_inst.addr is 10.64.64.125:6789/0 need_addr=0 > -7> 2016-08-13 22:30:54.608279 7fa8b9e088c0 5 adding auth protocol: > cephx > -6> 2016-08-13 22:30:54.608282 7fa8b9e088c0 5 adding auth protocol: > cephx > -5> 2016-08-13 22:30:54.608311 7fa8b9e088c0 10 log_channel(cluster) > update_config to_monitors: true to_syslog: false syslog_facility: daemon > prio: info) > -4> 2016-08-13 22:30:54.608317 7fa8b9e088c0 10 log_channel(audit) > update_config to_monitors: true to_syslog: false syslog_facility: local0 > prio: info) > -3> 2016-08-13 22:30:54.608395 7fa8b9e088c0 1 mon.kh10-8@-1(probing) > e1 preinit fsid e452874b-cb29-4468-ac7f-f8901dfccebf > -2> 2016-08-13 22:30:54.608617 7fa8b9e088c0 1 > mon.kh10-8@-1(probing).paxosservice(pgmap > 0..35606392) refresh upgraded, format 0 -> 1 > -1> 2016-08-13 22:30:54.608629 7fa8b9e088c0 1 mon.kh10-8@-1(probing).pg > v0 on_upgrade discarding in-core PGMap > 0> 2016-08-13 22:30:54.611791 7fa8b9e088c0 -1 *** Caught signal > (Aborted) ** > in thread 7fa8b9e088c0 > > ceph version 0.94.7 
(d56bdf93ced6b80b07397d57e3fa68fe68304432) > 1: ceph-mon() [0x9b25ea] > 2: (()+0x10330) [0x7fa8b8f0b330] > 3: (gsignal()+0x37) [0x7fa8b73a8c37] > 4: (abort()+0x148) [0x7fa8b73ac028] > 5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7fa8b7cb3535] > 6: (()+0x5e6d6) [0x7fa8b7cb16d6] > 7: (()+0x5e703) [0x7fa8b7cb1703] > 8: (()+0x5e922) [0x7fa8b7cb1922] > 9: ceph-mon() [0x853c39] > 10: (object_stat_collection_t::decode(ceph::buffer::list::iterator&)+0x167) > [0x894227] > 11: (pg_stat_t::decode(ceph::buffer::list::iterator&)+0x5ff) [0x894baf] > 12: (PGMap::update_pg(pg_t, ceph::buffer::list&)+0xa3) [0x91a8d3] > 13: (PGMonitor::read_pgmap_full()+0x1d8) [0x68b9b8] > 14: (PGMonitor::update_from_paxos(bool*)+0xbf7) [0x6977b7] > 15: (PaxosService::refresh(bool*)+0x19a) [0x605b5a] > 16: (Monitor::refresh_from_paxos(bool*)+0x1db) [0x5b1ffb] > 17: (Monitor::init_paxos()+0x85) [0x5b2365] > 18: (Monitor::preinit()+0x7d7) [0x5b6f87] > 19: (main()+0x230c) [0x57853c] > 20: (__libc_start_main()+0xf5) [0x7fa8b7393f45] > 21: ceph-mon() [0x59a3c7] > NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed > to interpret this. 
> > --- logging levels --- > 0/ 5 none > 0/ 1 lockdep > 0/ 1 context > 1/ 1 crush > 1/ 5 mds > 1/ 5 mds_balancer > 1/ 5 mds_locker > 1/ 5 mds_log > 1/ 5 mds_log_expire > 1/ 5 mds_migrator > 0/ 1 buffer > 0/ 1 timer > 0/ 1 filer > 0/ 1 striper > 0/ 1 objecter > 0/ 5 rados > 0/ 5 rbd > 0/ 5 rbd_replay > 0/ 5 journaler > 0/ 5 objectcacher > 0/ 5 client > 0/ 5 osd > 0/ 5 optracker > 0/ 5 objclass > 1/ 3 filestore > 1/ 3 keyvaluestore > 1/ 3 journal > 0/ 5 ms > 1/ 5 mon > 0/10 monc > 1/ 5 paxos > 0/ 5 tp > 1/ 5 auth > 1/ 5 crypto > 1/ 1 finisher > 1/ 5 heartbeatmap > 1/ 5 perfcounter > 1/ 5 rgw > 1/10 civetweb > 1/ 5 javaclient > 1/ 5 asok > 1/ 1 throttle > 0/ 0 refs > 1/ 5 xio > -2/-2 (syslog threshold) > 99/99 (stderr threshold) > max_recent 10000 > max_new 1000 > log_file > --- end dump of recent events --- > Aborted (core dumped) > --------------------------------------- > --------------------------------------- > > I feel like I am so close but so far. Can anyone give me a nudge as to > what I can do next? it looks like it is bombing out on trying to get an > updated paxos. > > > > On Fri, Aug 12, 2016 at 1:09 PM, Sean Sullivan <seapasu...@uchicago.edu> > wrote: > >> A coworker patched leveldb and we were able to export quite a bit of data >> from kh08's leveldb database. At this point I think I need to re-construct >> a new leveldb with whatever values I can. Is it the same leveldb database >> across all 3 montiors? IE will keys exported from one work in the other? >> All should have the same keys/values although constructed differently >> right? I can't blindly copy /var/lib/ceph/mon/ceph-$(hostname)/store.db/ >> from one host to another right? But can I copy the keys/values from one to >> another? >> >> On Fri, Aug 12, 2016 at 12:45 PM, Sean Sullivan <seapasu...@uchicago.edu> >> wrote: >> >>> ceph-monstore-tool? Is that the same as monmaptool? oops! 
NM found it in >>> ceph-test package:: >>> >>> I can't seem to get it working :-( dump monmap or any of the commands. >>> They all bomb out with the same message: >>> >>> root@kh10-8:/var/lib/ceph/mon/ceph-kh10-8/store.db# ceph-monstore-tool >>> /var/lib/ceph/mon/ceph-kh10-8 dump-trace -- /tmp/test.trace >>> Corruption: 1 missing files; e.g.: /var/lib/ceph/mon/ceph-kh10-8/ >>> store.db/10882319.ldb >>> root@kh10-8:/var/lib/ceph/mon/ceph-kh10-8/store.db# ceph-monstore-tool >>> /var/lib/ceph/mon/ceph-kh10-8 dump-keys >>> Corruption: 1 missing files; e.g.: /var/lib/ceph/mon/ceph-kh10-8/ >>> store.db/10882319.ldb >>> >>> >>> I need to clarify as I originally had 2 clusters with this issue and now >>> I have 1 with all 3 monitors dead and 1 that I was successfully able to >>> repair. I am about to recap everything I know about the issue and the issue >>> at hand. Should I start a new email thread about this instead? >>> >>> The cluster that is currently having issues is on hammer (94.7), and the >>> monitor stats are the same:: >>> root@kh08-8:~# cat /proc/cpuinfo | grep -iE "model name" | uniq -c >>> 24 model name : Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz >>> ext4 volume comprised of 4x300GB 10k drives in raid 10. 
>>> ubuntu 14.04 >>> >>> root@kh08-8:~# uname -a >>> Linux kh08-8 3.13.0-76-generic #120-Ubuntu SMP Mon Jan 18 15:59:10 UTC >>> 2016 x86_64 x86_64 x86_64 GNU/Linux >>> root@kh08-8:~# ceph --version >>> ceph version 0.94.7 (d56bdf93ced6b80b07397d57e3fa68fe68304432) >>> >>> >>> From here: Here are the errors I am getting when starting each of the >>> monitors:: >>> >>> >>> --------------- >>> root@kh08-8:~# /usr/bin/ceph-mon --cluster=ceph -i kh08-8 -d >>> 2016-08-11 22:15:23.731550 7fe5ad3e98c0 0 ceph version 0.94.7 >>> (d56bdf93ced6b80b07397d57e3fa68fe68304432), process ceph-mon, pid 317309 >>> Corruption: error in middle of record >>> 2016-08-11 22:15:28.274340 7fe5ad3e98c0 -1 error opening mon data >>> directory at '/var/lib/ceph/mon/ceph-kh08-8': (22) Invalid argument >>> -- >>> root@kh09-8:~# /usr/bin/ceph-mon --cluster=ceph -i kh09-8 -d >>> 2016-08-11 22:14:28.252370 7f7eaab908c0 0 ceph version 0.94.7 >>> (d56bdf93ced6b80b07397d57e3fa68fe68304432), process ceph-mon, pid 308888 >>> Corruption: 14 missing files; e.g.: /var/lib/ceph/mon/ceph-kh09-8/ >>> store.db/10845998.ldb >>> 2016-08-11 22:14:35.094237 7f7eaab908c0 -1 error opening mon data >>> directory at '/var/lib/ceph/mon/ceph-kh09-8': (22) Invalid argument >>> -- >>> root@kh10-8:/var/lib/ceph/mon/ceph-kh10-8/store.db# /usr/bin/ceph-mon >>> --cluster=ceph -i kh10-8 -d >>> 2016-08-11 22:17:54.632762 7f80bf34d8c0 0 ceph version 0.94.7 >>> (d56bdf93ced6b80b07397d57e3fa68fe68304432), process ceph-mon, pid 292620 >>> Corruption: 1 missing files; e.g.: /var/lib/ceph/mon/ceph-kh10-8/ >>> store.db/10882319.ldb >>> 2016-08-11 22:18:01.207749 7f80bf34d8c0 -1 error opening mon data >>> directory at '/var/lib/ceph/mon/ceph-kh10-8': (22) Invalid argument >>> --------------- >>> >>> >>> for kh08, a coworker patched leveldb to print and skip on the first >>> error and that one is also missing a bunch of files. 
>>> As such I think kh10-8 is my most likely candidate to recover, but
>>> either way recovery is probably not an option. I see leveldb has a
>>> repair.cc (https://github.com/google/leveldb/blob/master/db/repair.cc)
>>> but I do not see repair mentioned in the monitor with respect to the
>>> dbstore. I tried using the leveldb python module (plyvel) to attempt a
>>> repair but my repl just ends up dying.
>>>
>>> I understand two things::
>>> 1.) Without rebuilding the monitor backend leveldb store (the cluster
>>> map as I understand it), all of the data in the cluster is essentially
>>> lost (right?)
>>> 2.) It is possible to rebuild this database via some form of magic or
>>> (source)ry, as all of this data is essentially held throughout the
>>> cluster as well.
>>>
>>> We only use radosgw / S3 for this cluster. If there is a way to recover
>>> my data that is easier/more likely than rebuilding the leveldb of a
>>> monitor and starting up a single-monitor cluster, I would like to
>>> switch gears and focus on that.
>>>
>>> Looking at the dev docs:
>>> http://docs.ceph.com/docs/hammer/architecture/#cluster-map
>>> it has 5 main parts::
>>>
>>> ```
>>> The Monitor Map: Contains the cluster fsid; the position, name,
>>> address, and port of each monitor. It also indicates the current epoch,
>>> when the map was created, and the last time it changed. To view a
>>> monitor map, execute ceph mon dump.
>>> The OSD Map: Contains the cluster fsid, when the map was created and
>>> last modified, a list of pools, replica sizes, PG numbers, a list of
>>> OSDs and their status (e.g., up, in). To view an OSD map, execute
>>> ceph osd dump.
>>> The PG Map: Contains the PG version, its time stamp, the last OSD map
>>> epoch, the full ratios, and details on each placement group such as the
>>> PG ID, the Up Set, the Acting Set, the state of the PG (e.g.,
>>> active + clean), and data usage statistics for each pool.
>>> The CRUSH Map: Contains a list of storage devices, the failure domain
>>> hierarchy (e.g., device, host, rack, row, room, etc.), and rules for
>>> traversing the hierarchy when storing data. To view a CRUSH map,
>>> execute ceph osd getcrushmap -o (unknown); then, decompile it by
>>> executing crushtool -d {comp-crushmap-filename} -o
>>> {decomp-crushmap-filename}. You can view the decompiled map in a text
>>> editor or with cat.
>>> The MDS Map: Contains the current MDS map epoch, when the map was
>>> created, and the last time it changed. It also contains the pool for
>>> storing metadata, a list of metadata servers, and which metadata
>>> servers are up and in. To view an MDS map, execute ceph mds dump.
>>> ```
>>>
>>> As we don't use cephfs, the mds map can essentially be blank (right?),
>>> so I am left with 4 valid maps needed to get a working cluster again. I
>>> don't see auth mentioned in there, but I need that too. Then I just
>>> need to rebuild the leveldb database somehow with the right information
>>> and I should be good. So a long, long journey ahead.
>>>
>>> I don't think that the data is stored in strings or json, right? Am I
>>> going down the wrong path here? Is there a shorter/simpler path to
>>> retrieve the data from a cluster that lost all 3 monitors in a power
>>> failure? If I am going down the right path, is there any advice on how
>>> I can assemble/repair the database?
>>>
>>> I see that there is an rbd recovery tool for a dead cluster. Is it
>>> possible to do the same with s3 objects?
>>>
>>> On Thu, Aug 11, 2016 at 11:15 AM, Wido den Hollander <w...@42on.com>
>>> wrote:
>>>
>>>>
>>>> > Op 11 augustus 2016 om 15:17 schreef Sean Sullivan <
>>>> seapasu...@uchicago.edu>:
>>>> >
>>>> >
>>>> > Hello Wido,
>>>> >
>>>> > Thanks for the advice. While the data center has a/b circuits and
>>>> > redundant power, etc., if a ground fault happens it travels outside
>>>> > and fails, causing the whole building to fail (apparently).
>>>> > >>>> > The monitors are each the same with >>>> > 2x e5 cpus >>>> > 64gb of ram >>>> > 4x 300gb 10k SAS drives in raid 10 (write through mode). >>>> > Ubuntu 14.04 with the latest updates prior to power failure >>>> (2016/Aug/10 - >>>> > 3am CST) >>>> > Ceph hammer LTS 0.94.7 >>>> > >>>> > (we are still working on our jewel test cluster so it is planned but >>>> not in >>>> > place yet) >>>> > >>>> > The only thing that seems to be corrupt is the monitors leveldb >>>> store. I >>>> > see multiple issues on Google leveldb github from March 2016 about >>>> fsync >>>> > and power failure so I assume this is an issue with leveldb. >>>> > >>>> > I have backed up /var/lib/ceph/Mon on all of my monitors before >>>> trying to >>>> > proceed with any form of recovery. >>>> > >>>> > Is there any way to reconstruct the leveldb or replace the monitors >>>> and >>>> > recover the data? >>>> > >>>> I don't know. I have never done it. Other people might know this better >>>> than me. >>>> >>>> Maybe 'ceph-monstore-tool' can help you? >>>> >>>> Wido >>>> >>>> > I found the following post in which sage says it is tedious but >>>> possible. ( >>>> > http://www.spinics.net/lists/ceph-devel/msg06662.html). Tedious is >>>> fine if >>>> > I have any chance of doing it. I have the fsid, the Mon key map and >>>> all of >>>> > the osds look to be fine so all of the previous osd maps are there. >>>> > >>>> > I just don't understand what key/values I need inside. >>>> > >>>> > On Aug 11, 2016 1:33 AM, "Wido den Hollander" <w...@42on.com> wrote: >>>> > >>>> > > >>>> > > > Op 11 augustus 2016 om 0:10 schreef Sean Sullivan < >>>> > > seapasu...@uchicago.edu>: >>>> > > > >>>> > > > >>>> > > > I think it just got worse:: >>>> > > > >>>> > > > all three monitors on my other cluster say that ceph-mon can't >>>> open >>>> > > > /var/lib/ceph/mon/$(hostname). Is there any way to recover if you >>>> lose >>>> > > all >>>> > > > 3 monitors? 
I saw a post by Sage saying that the data can be >>>> recovered as >>>> > > > all of the data is held on other servers. Is this possible? If so >>>> has >>>> > > > anyone had any experience doing so? >>>> > > >>>> > > I have never done so, so I couldn't tell you. >>>> > > >>>> > > However, it is weird that on all three it got corrupted. What >>>> hardware are >>>> > > you using? Was it properly protected against power failure? >>>> > > >>>> > > If you mon store is corrupted I'm not sure what might happen. >>>> > > >>>> > > However, make a backup of ALL monitors right now before doing >>>> anything. >>>> > > >>>> > > Wido >>>> > > >>>> > > > _______________________________________________ >>>> > > > ceph-users mailing list >>>> > > > ceph-users@lists.ceph.com >>>> > > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >>>> > > >>>> >>> >>> >>> >>> -- >>> - Sean: I wrote this. - >>> >> >> >> >> -- >> - Sean: I wrote this. - >> > > > > -- > - Sean: I wrote this. - > -- - Sean: I wrote this. -
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com