I am trying to set up a Ceph cluster on 4 ODROID-HC2 instances on top of Ubuntu 18.04.
My ceph-mgr daemon keeps crashing on me. Any advice on how to proceed?

The log on the mgr node says something about ms_dispatch:

2019-05-20 15:34:43.070424 b6714230 0 set uid:gid to 64045:64045 (ceph:ceph)
2019-05-20 15:34:43.070455 b6714230 0 ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable), process ceph-mgr, pid 1169
2019-05-20 15:34:43.070799 b6714230 0 pidfile_write: ignore empty --pid-file
2019-05-20 15:34:43.101162 b6714230 1 mgr send_beacon standby
2019-05-20 15:34:43.124462 b06f8c30 -1 *** Caught signal (Segmentation fault) **
 in thread b06f8c30 thread_name:ms_dispatch

 ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable)
 1: (()+0x30133c) [0x77033c]
 2: (()+0x25750) [0xb688a750]
 3: (_ULarm_step()+0x55) [0xb6816ce6]
 4: (()+0x255e8) [0xb6cd85e8]
 5: (GetStackTrace(void**, int, int)+0x25) [0xb6cd8a3e]
 6: (tcmalloc::PageHeap::GrowHeap(unsigned int)+0xb9) [0xb6ccd36a]
 7: (tcmalloc::PageHeap::New(unsigned int)+0x79) [0xb6ccd5e6]
 8: (tcmalloc::CentralFreeList::Populate()+0x71) [0xb6ccc5ce]
 9: (tcmalloc::CentralFreeList::FetchFromOneSpansSafe(int, void**, void**)+0x1b) [0xb6ccc760]
 10: (tcmalloc::CentralFreeList::RemoveRange(void**, void**, int)+0x6d) [0xb6ccc7de]
 11: (tcmalloc::ThreadCache::FetchFromCentralCache(unsigned int, unsigned int)+0x51) [0xb6ccea56]
 12: (malloc()+0x22d) [0xb6cd9a8e]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---
-90> 2019-05-20 15:34:43.053293 b6714230 5 asok(0x55b5320) register_command perfcounters_dump hook 0x554c088
-89> 2019-05-20 15:34:43.053322 b6714230 5 asok(0x55b5320) register_command 1 hook 0x554c088
-88> 2019-05-20 15:34:43.053330 b6714230 5 asok(0x55b5320) register_command perf dump hook 0x554c088
-87> 2019-05-20 15:34:43.053341 b6714230 5 asok(0x55b5320) register_command perfcounters_schema hook 0x554c088
-86> 2019-05-20 15:34:43.053360 b6714230 5 asok(0x55b5320) register_command perf histogram dump hook 0x554c088
-85> 2019-05-20 15:34:43.053374 b6714230 5 asok(0x55b5320) register_command 2 hook 0x554c088
-84> 2019-05-20 15:34:43.053381 b6714230 5 asok(0x55b5320) register_command perf schema hook 0x554c088
-83> 2019-05-20 15:34:43.053389 b6714230 5 asok(0x55b5320) register_command perf histogram schema hook 0x554c088
-82> 2019-05-20 15:34:43.053410 b6714230 5 asok(0x55b5320) register_command perf reset hook 0x554c088
-81> 2019-05-20 15:34:43.053418 b6714230 5 asok(0x55b5320) register_command config show hook 0x554c088
-80> 2019-05-20 15:34:43.053425 b6714230 5 asok(0x55b5320) register_command config help hook 0x554c088
-79> 2019-05-20 15:34:43.053436 b6714230 5 asok(0x55b5320) register_command config set hook 0x554c088
-78> 2019-05-20 15:34:43.053444 b6714230 5 asok(0x55b5320) register_command config get hook 0x554c088
-77> 2019-05-20 15:34:43.053459 b6714230 5 asok(0x55b5320) register_command config diff hook 0x554c088
-76> 2019-05-20 15:34:43.053467 b6714230 5 asok(0x55b5320) register_command config diff get hook 0x554c088
-75> 2019-05-20 15:34:43.053475 b6714230 5 asok(0x55b5320) register_command log flush hook 0x554c088
-74> 2019-05-20 15:34:43.053482 b6714230 5 asok(0x55b5320) register_command log dump hook 0x554c088
-73> 2019-05-20 15:34:43.053490 b6714230 5 asok(0x55b5320) register_command log reopen hook 0x554c088
-72> 2019-05-20 15:34:43.053513 b6714230 5 asok(0x55b5320) register_command dump_mempools hook 0x56e3504
-71> 2019-05-20 15:34:43.070424 b6714230 0 set uid:gid to 64045:64045 (ceph:ceph)
-70> 2019-05-20 15:34:43.070455 b6714230 0 ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable), process ceph-mgr, pid 1169
-69> 2019-05-20 15:34:43.070799 b6714230 0 pidfile_write: ignore empty --pid-file
-68> 2019-05-20 15:34:43.074441 b6714230 5 asok(0x55b5320) init /var/run/ceph/ceph-mgr.odroid-c.asok
-67> 2019-05-20 15:34:43.074473 b6714230 5 asok(0x55b5320) bind_and_listen /var/run/ceph/ceph-mgr.odroid-c.asok
-66> 2019-05-20 15:34:43.074615 b6714230 5 asok(0x55b5320) register_command 0 hook 0x554c1d0
-65> 2019-05-20 15:34:43.074633 b6714230 5 asok(0x55b5320) register_command version hook 0x554c1d0
-64> 2019-05-20 15:34:43.074654 b6714230 5 asok(0x55b5320) register_command git_version hook 0x554c1d0
-63> 2019-05-20 15:34:43.074674 b6714230 5 asok(0x55b5320) register_command help hook 0x554c1d8
-62> 2019-05-20 15:34:43.074694 b6714230 5 asok(0x55b5320) register_command get_command_descriptions hook 0x554c1e0
-61> 2019-05-20 15:34:43.074785 b3effc30 5 asok(0x55b5320) entry start
-60> 2019-05-20 15:34:43.076464 b36fec30 2 Event(0x554e068 nevent=5000 time_id=1).set_owner idx=0 owner=3010456624
-59> 2019-05-20 15:34:43.076559 b2efdc30 2 Event(0x554e488 nevent=5000 time_id=1).set_owner idx=1 owner=3002063920
-58> 2019-05-20 15:34:43.076643 b26fcc30 2 Event(0x554e1c8 nevent=5000 time_id=1).set_owner idx=2 owner=2993671216
-57> 2019-05-20 15:34:43.077177 b6714230 1 Processor -- start
-56> 2019-05-20 15:34:43.077298 b6714230 1 -- - start start
-55> 2019-05-20 15:34:43.077315 b6714230 10 monclient: build_initial_monmap
-54> 2019-05-20 15:34:43.077362 b6714230 10 monclient: init
-53> 2019-05-20 15:34:43.077380 b6714230 5 adding auth protocol: cephx
-52> 2019-05-20 15:34:43.077391 b6714230 10 monclient: auth_supported 2 method cephx
-51> 2019-05-20 15:34:43.077625 b6714230 2 auth: KeyRing::load: loaded key file /var/lib/ceph/mgr/ceph-odroid-c/keyring
-50> 2019-05-20 15:34:43.077761 b6714230 10 monclient: _reopen_session rank -1
-49> 2019-05-20 15:34:43.077847 b6714230 10 monclient(hunting): picked mon.noname-a con 0x5792d00 addr 192.168.130.131:6789/0
-48> 2019-05-20 15:34:43.077899 b6714230 1 -- - --> 192.168.130.131:6789/0 -- auth(proto 0 33 bytes epoch 0) v1 -- 0x5590680 con 0
-47> 2019-05-20 15:34:43.077985 b6714230 10 monclient(hunting): _renew_subs
-46> 2019-05-20 15:34:43.080980 b2efdc30 1 -- 192.168.130.132:0/2049423493 learned_addr learned my addr 192.168.130.132:0/2049423493
-45> 2019-05-20 15:34:43.082020 b2efdc30 2 -- 192.168.130.132:0/2049423493 >> 192.168.130.131:6789/0 conn(0x5792d00 :-1 s=STATE_CONNECTING_WAIT_ACK_SEQ pgs=0 cs=0 l=0)._process_connection got newly_acked_seq 0 vs out_seq 0
-44> 2019-05-20 15:34:43.084528 b2efdc30 5 -- 192.168.130.132:0/2049423493 >> 192.168.130.131:6789/0 conn(0x5792d00 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=45 cs=1 l=1). rx mon.0 seq 1 0x55aa900 mon_map magic: 0 v1
-43> 2019-05-20 15:34:43.084615 b06f8c30 1 -- 192.168.130.132:0/2049423493 <== mon.0 192.168.130.131:6789/0 1 ==== mon_map magic: 0 v1 ==== 196+0+0 (1694575244 0 0) 0x55aa900 con 0x5792d00
-42> 2019-05-20 15:34:43.084656 b06f8c30 10 monclient(hunting): handle_monmap mon_map magic: 0 v1
-41> 2019-05-20 15:34:43.084685 b06f8c30 10 monclient(hunting): got monmap 1, mon.noname-a is now rank -1
-40> 2019-05-20 15:34:43.084698 b06f8c30 10 monclient(hunting): dump:
epoch 1
fsid 75cb9a2d-673b-4a32-897a-05470a08ed58
last_changed 2019-05-20 15:02:53.998735
created 2019-05-20 15:02:53.998735
0: 192.168.130.131:6789/0 mon.odroid-b
-39> 2019-05-20 15:34:43.084956 b2efdc30 5 -- 192.168.130.132:0/2049423493 >> 192.168.130.131:6789/0 conn(0x5792d00 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=45 cs=1 l=1). rx mon.0 seq 2 0x55a0540 auth_reply(proto 2 0 (0) Success) v1
-38> 2019-05-20 15:34:43.085011 b06f8c30 1 -- 192.168.130.132:0/2049423493 <== mon.0 192.168.130.131:6789/0 2 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 33+0+0 (4086221156 0 0) 0x55a0540 con 0x5792d00
-37> 2019-05-20 15:34:43.085053 b06f8c30 10 monclient(hunting): my global_id is 24139
-36> 2019-05-20 15:34:43.085175 b06f8c30 1 -- 192.168.130.132:0/2049423493 --> 192.168.130.131:6789/0 -- auth(proto 2 32 bytes epoch 0) v1 -- 0x5590d00 con 0
-35> 2019-05-20 15:34:43.088488 b2efdc30 5 -- 192.168.130.132:0/2049423493 >> 192.168.130.131:6789/0 conn(0x5792d00 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=45 cs=1 l=1). rx mon.0 seq 3 0x55a0700 auth_reply(proto 2 0 (0) Success) v1
-34> 2019-05-20 15:34:43.088712 b06f8c30 1 -- 192.168.130.132:0/2049423493 <== mon.0 192.168.130.131:6789/0 3 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 222+0+0 (1945430716 0 0) 0x55a0700 con 0x5792d00
-33> 2019-05-20 15:34:43.089295 b06f8c30 1 -- 192.168.130.132:0/2049423493 --> 192.168.130.131:6789/0 -- auth(proto 2 181 bytes epoch 0) v1 -- 0x5590680 con 0
-32> 2019-05-20 15:34:43.097488 b2efdc30 5 -- 192.168.130.132:0/2049423493 >> 192.168.130.131:6789/0 conn(0x5792d00 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=45 cs=1 l=1). rx mon.0 seq 4 0x55a08c0 auth_reply(proto 2 0 (0) Success) v1
-31> 2019-05-20 15:34:43.097643 b06f8c30 1 -- 192.168.130.132:0/2049423493 <== mon.0 192.168.130.131:6789/0 4 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 783+0+0 (327382700 0 0) 0x55a08c0 con 0x5792d00
-30> 2019-05-20 15:34:43.098725 b06f8c30 1 monclient: found mon.odroid-b
-29> 2019-05-20 15:34:43.098850 b06f8c30 10 monclient: _send_mon_message to mon.odroid-b at 192.168.130.131:6789/0
-28> 2019-05-20 15:34:43.098898 b06f8c30 1 -- 192.168.130.132:0/2049423493 --> 192.168.130.131:6789/0 -- mon_subscribe({mgrmap=0+,monmap=0+}) v2 -- 0x554eb00 con 0
-27> 2019-05-20 15:34:43.099042 b06f8c30 10 monclient: _check_auth_rotating renewing rotating keys (they expired before 2019-05-20 15:34:13.099036)
-26> 2019-05-20 15:34:43.099183 b06f8c30 10 monclient: _send_mon_message to mon.odroid-b at 192.168.130.131:6789/0
-25> 2019-05-20 15:34:43.099271 b06f8c30 1 -- 192.168.130.132:0/2049423493 --> 192.168.130.131:6789/0 -- auth(proto 2 2 bytes epoch 0) v1 -- 0x5590d00 con 0
-24> 2019-05-20 15:34:43.099404 b6714230 5 monclient: authenticate success, global_id 24139
-23> 2019-05-20 15:34:43.099543 b6714230 10 log_channel(cluster) update_config to_monitors: true to_syslog: false syslog_facility: daemon prio: info to_graylog: false graylog_host: 127.0.0.1 graylog_port: 12201)
-22> 2019-05-20 15:34:43.099602 b6714230 10 log_channel(audit) update_config to_monitors: true to_syslog: false syslog_facility: local0 prio: info to_graylog: false graylog_host: 127.0.0.1 graylog_port: 12201)
-21> 2019-05-20 15:34:43.099970 b6714230 5 asok(0x55b5320) register_command objecter_requests hook 0x554c238
-20> 2019-05-20 15:34:43.100171 b6714230 10 monclient: _renew_subs
-19> 2019-05-20 15:34:43.100214 b6714230 10 monclient: _send_mon_message to mon.odroid-b at 192.168.130.131:6789/0
-18> 2019-05-20 15:34:43.100246 b6714230 1 -- 192.168.130.132:0/2049423493 --> 192.168.130.131:6789/0 -- mon_subscribe({osdmap=0}) v2 -- 0x554ec60 con 0
-17> 2019-05-20 15:34:43.100737 b6714230 5 asok(0x55b5320) register_command mds_requests hook 0xbefefe80
-16> 2019-05-20 15:34:43.100793 b6714230 5 asok(0x55b5320) register_command mds_sessions hook 0xbefefe80
-15> 2019-05-20 15:34:43.100847 b6714230 5 asok(0x55b5320) register_command dump_cache hook 0xbefefe80
-14> 2019-05-20 15:34:43.100811 b2efdc30 5 -- 192.168.130.132:0/2049423493 >> 192.168.130.131:6789/0 conn(0x5792d00 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=45 cs=1 l=1). rx mon.0 seq 5 0x558dc00 mgrmap(e 99) v1
-13> 2019-05-20 15:34:43.100915 b6714230 5 asok(0x55b5320) register_command kick_stale_sessions hook 0xbefefe80
-12> 2019-05-20 15:34:43.100977 b6714230 5 asok(0x55b5320) register_command status hook 0xbefefe80
-11> 2019-05-20 15:34:43.100987 b06f8c30 1 -- 192.168.130.132:0/2049423493 <== mon.0 192.168.130.131:6789/0 5 ==== mgrmap(e 99) v1 ==== 232+0+0 (4078310027 0 0) 0x558dc00 con 0x5792d00
-10> 2019-05-20 15:34:43.101004 b2efdc30 5 -- 192.168.130.132:0/2049423493 >> 192.168.130.131:6789/0 conn(0x5792d00 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=45 cs=1 l=1). rx mon.0 seq 6 0x55aaa80 mon_map magic: 0 v1
-9> 2019-05-20 15:34:43.101162 b6714230 1 mgr send_beacon standby
-8> 2019-05-20 15:34:43.101575 b2efdc30 5 -- 192.168.130.132:0/2049423493 >> 192.168.130.131:6789/0 conn(0x5792d00 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=45 cs=1 l=1). rx mon.0 seq 7 0x55a0540 auth_reply(proto 2 0 (0) Success) v1
-7> 2019-05-20 15:34:43.101889 b2efdc30 5 -- 192.168.130.132:0/2049423493 >> 192.168.130.131:6789/0 conn(0x5792d00 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=45 cs=1 l=1). rx mon.0 seq 8 0x5590d00 osd_map(42..42 src has 1..42) v3
-6> 2019-05-20 15:34:43.102775 b6714230 10 monclient: _send_mon_message to mon.odroid-b at 192.168.130.131:6789/0
-5> 2019-05-20 15:34:43.102838 b6714230 1 -- 192.168.130.132:0/2049423493 --> 192.168.130.131:6789/0 -- mgrbeacon mgr.odroid-c(75cb9a2d-673b-4a32-897a-05470a08ed58,24139, -, 0) v6 -- 0x5562400 con 0
-4> 2019-05-20 15:34:43.102991 b6714230 4 mgr init Complete.
-3> 2019-05-20 15:34:43.103065 b06f8c30 4 mgr ms_dispatch standby mgrmap(e 99) v1
-2> 2019-05-20 15:34:43.103110 b06f8c30 4 mgr handle_mgr_map received map epoch 99
-1> 2019-05-20 15:34:43.103128 b06f8c30 4 mgr handle_mgr_map active in map: 0 active is 24134
0> 2019-05-20 15:34:43.124462 b06f8c30 -1 *** Caught signal (Segmentation fault) **
 in thread b06f8c30 thread_name:ms_dispatch

 ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable)
 1: (()+0x30133c) [0x77033c]
 2: (()+0x25750) [0xb688a750]
 3: (_ULarm_step()+0x55) [0xb6816ce6]
 4: (()+0x255e8) [0xb6cd85e8]
 5: (GetStackTrace(void**, int, int)+0x25) [0xb6cd8a3e]
 6: (tcmalloc::PageHeap::GrowHeap(unsigned int)+0xb9) [0xb6ccd36a]
 7: (tcmalloc::PageHeap::New(unsigned int)+0x79) [0xb6ccd5e6]
 8: (tcmalloc::CentralFreeList::Populate()+0x71) [0xb6ccc5ce]
 9: (tcmalloc::CentralFreeList::FetchFromOneSpansSafe(int, void**, void**)+0x1b) [0xb6ccc760]
 10: (tcmalloc::CentralFreeList::RemoveRange(void**, void**, int)+0x6d) [0xb6ccc7de]
 11: (tcmalloc::ThreadCache::FetchFromCentralCache(unsigned int, unsigned int)+0x51) [0xb6ccea56]
 12: (malloc()+0x22d) [0xb6cd9a8e]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 rbd_mirror
   0/ 5 rbd_replay
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 client
   1/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 journal
   0/ 5 ms
   1/ 5 mon
   0/10 monc
   1/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 1 reserver
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/10 civetweb
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
   0/ 0 refs
   1/ 5 xio
   1/ 5 compressor
   1/ 5 bluestore
   1/ 5 bluefs
   1/ 3 bdev
   1/ 5 kstore
   4/ 5 rocksdb
   4/ 5 leveldb
   4/ 5 memdb
   1/ 5 kinetic
   1/ 5 fuse
   1/ 5 mgr
   1/ 5 mgrc
   1/ 5 dpdk
   1/ 5 eventtrace
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
  max_recent 10000
  max_new 1000
  log_file /var/log/ceph/ceph-mgr.odroid-c.log
--- end dump of recent events ---

Kind regards
Jesper
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com