[ceph-users] Unable to join additional mon servers (luminous)
Hello,

I'm running a ceph-12.2.2 cluster on debian/stretch with three mon servers, unsuccessfully trying to add another (or two additional) mon servers. While the new mon server stays in state "synchronizing", the old mon servers fall out of quorum, endlessly changing state from "peon" to "electing" or "probing", and eventually back to "peon" or "leader".

On a small test cluster everything works as expected; the new mons painlessly join the cluster. But on my production cluster I always run into trouble, both with ceph-deploy and with manual intervention. Probably I'm missing some fundamental factor. Maybe anyone can give me a hint?

These are the existing mons:

my-ceph-mon-3: IP AAA.BBB.CCC.23
my-ceph-mon-4: IP AAA.BBB.CCC.24
my-ceph-mon-5: IP AAA.BBB.CCC.25

Trying to add:

my-ceph-mon-1: IP AAA.BBB.CCC.31

Here is a (hopefully) relevant and representative part of the logs on my-ceph-mon-5 when my-ceph-mon-1 tries to join:

2018-01-11 15:16:08.340741 7f69ba8db700 0 mon.my-ceph-mon-5@2(peon).data_health(6128) update_stats avail 57% total 19548 MB, used 8411 MB, avail 11149 MB
2018-01-11 15:16:16.830566 7f69b48cf700 0 -- AAA.BBB.CCC.18:6789/0 >> AAA.BBB.CCC.31:6789/0 conn(0x55d19cac2000 :6789 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept connect_seq 0 vs existing csq=1 existing_state=STATE_STANDBY
2018-01-11 15:16:16.830582 7f69b48cf700 0 -- AAA.BBB.CCC.18:6789/0 >> AAA.BBB.CCC.31:6789/0 conn(0x55d19cac2000 :6789 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept peer reset, then tried to connect to us, replacing
2018-01-11 15:16:16.831864 7f69b80d6700 1 mon.my-ceph-mon-5@2(peon) e15 adding peer AAA.BBB.CCC.31:6789/0 to list of hints
2018-01-11 15:16:16.833701 7f69b50d0700 0 -- AAA.BBB.CCC.18:6789/0 >> AAA.BBB.CCC.31:6789/0 conn(0x55d19c8ca000 :6789 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept connect_seq 0 vs existing csq=1 existing_state=STATE_STANDBY
2018-01-11 15:16:16.833713 7f69b50d0700 0 -- AAA.BBB.CCC.18:6789/0 >> AAA.BBB.CCC.31:6789/0 conn(0x55d19c8ca000 :6789 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept peer reset, then tried to connect to us, replacing
2018-01-11 15:16:16.834843 7f69b80d6700 1 mon.my-ceph-mon-5@2(peon) e15 adding peer AAA.BBB.CCC.31:6789/0 to list of hints
2018-01-11 15:16:35.907962 7f69ba8db700 1 mon.my-ceph-mon-5@2(peon).paxos(paxos active c 9653210..9653763) lease_timeout -- calling new election
2018-01-11 15:16:35.908589 7f69b80d6700 0 mon.my-ceph-mon-5@2(probing) e15 handle_command mon_command({"prefix": "status"} v 0) v1
2018-01-11 15:16:35.908630 7f69b80d6700 0 log_channel(audit) log [DBG] : from='client.? 172.25.24.15:0/1078983440' entity='client.admin' cmd=[{"prefix": "status"}]: dispatch
2018-01-11 15:16:35.909124 7f69b80d6700 0 log_channel(cluster) log [INF] : mon.my-ceph-mon-5 calling new monitor election
2018-01-11 15:16:35.909284 7f69b80d6700 1 mon.my-ceph-mon-5@2(electing).elector(6128) init, last seen epoch 6128
2018-01-11 15:16:50.132414 7f69ba8db700 1 mon.my-ceph-mon-5@2(electing).elector(6129) init, last seen epoch 6129, mid-election, bumping
2018-01-11 15:16:55.209177 7f69b80d6700 -1 mon.my-ceph-mon-5@2(peon).paxos(paxos recovering c 9653210..9653777) lease_expire from mon.0 AAA.BBB.CCC.23:6789/0 is 0.032801 seconds in the past; mons are probably laggy (or possibly clocks are too skewed)
2018-01-11 15:17:09.316472 7f69ba8db700 1 mon.my-ceph-mon-5@2(peon).paxos(paxos updating c 9653210..9653778) lease_timeout -- calling new election
2018-01-11 15:17:09.316597 7f69ba8db700 0 mon.my-ceph-mon-5@2(probing).data_health(6134) update_stats avail 57% total 19548 MB, used 8411 MB, avail 11149 MB
2018-01-11 15:17:09.317414 7f69b80d6700 0 log_channel(cluster) log [INF] : mon.my-ceph-mon-5 calling new monitor election
2018-01-11 15:17:09.317517 7f69b80d6700 1 mon.my-ceph-mon-5@2(electing).elector(6134) init, last seen epoch 6134
2018-01-11 15:17:22.059573 7f69ba8db700 1 mon.my-ceph-mon-5@2(peon).paxos(paxos updating c 9653210..9653779) lease_timeout -- calling new election
2018-01-11 15:17:22.060021 7f69b80d6700 1 mon.my-ceph-mon-5@2(probing).data_health(6138) service_dispatch_op not in quorum -- drop message
2018-01-11 15:17:22.060279 7f69b80d6700 1 mon.my-ceph-mon-5@2(probing).data_health(6138) service_dispatch_op not in quorum -- drop message
2018-01-11 15:17:22.060499 7f69b80d6700 0 log_channel(cluster) log [INF] : mon.my-ceph-mon-5 calling new monitor election
2018-01-11 15:17:22.060612 7f69b80d6700 1 mon.my-ceph-mon-5@2(electing).elector(6138) init, last seen epoch 6138
...

As far as I can see clock skew is not a problem (tested with "ntpq -p"). Any idea what might go wrong?

Thanks, Thomas

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
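[Editor's note: not part of the original post. As a sketch, these are the checks commonly run while a new mon sits in "synchronizing"; they assume a mon host with the ceph CLI and admin keyring, and bail out quietly elsewhere.]

```shell
#!/bin/sh
# Sketch: quick diagnostics while a new mon is stuck "synchronizing".
# Assumes the ceph CLI and admin keyring are present; exits quietly elsewhere.
if ! command -v ceph >/dev/null 2>&1; then
    echo "ceph CLI not available here"
    exit 0
fi
ceph time-sync-status                        # per-mon clock skew as seen by the quorum leader
ceph mon stat                                # current monmap epoch and quorum membership
ceph daemon "mon.$(hostname -s)" mon_status  # this mon's state (probing/synchronizing/electing/peon)
```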
Re: [ceph-users] osd heartbeat protocol issue on upgrade v12.1.0 ->v12.2.0
Hello,

thank you very much for the hint, you are right!

Kind regards, Thomas

Marc Roos wrote on 30.08.2017 at 14:26:
>
> I had this also once. If you update all nodes and then systemctl restart
> 'ceph-osd@*' on all nodes, you should be fine. But first the monitors of
> course
[ceph-users] osd heartbeat protocol issue on upgrade v12.1.0 ->v12.2.0
Hello,

when I upgraded a single osd node (so far) from v12.1.0 -> v12.2.0, its osds started flapping and finally all got marked as down. As far as I can see, this is due to an incompatibility of the osd heartbeat protocol between the two versions:

v12.2.0 node:
7f4f7b6e6700 -1 osd.X 3879 heartbeat_check: no reply from x.x.x.x: osd.Y ever on either front or back, first ping sent ...

v12.1.0 node:
7fd854ebf700 -1 failed to decode message of type 70 v4: buffer::malformed_input: void osd_peer_stat_t::decode(ceph::buffer::list::iterator&) no longer understand old encoding version 1 < struct_compat

(It is puzzling that the *older* v12.1.0 node complains about the *old* encoding version of the *newer* v12.2.0 node.)

Any idea how I can go ahead?

Kind regards, Thomas
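[Editor's note: not part of the original post. For anyone hitting the same mixed-version flapping, a sketch of how to see which daemons still run the old release; it assumes a working ceph CLI against a luminous-era cluster and bails out quietly elsewhere.]

```shell
#!/bin/sh
# Sketch: identify which daemons still run the old release mid-upgrade.
# Needs the ceph CLI and a reachable cluster; exits quietly elsewhere.
if ! command -v ceph >/dev/null 2>&1; then
    echo "ceph CLI not available here"
    exit 0
fi
ceph versions            # per-daemon-type version summary (available on luminous)
ceph tell osd.\* version # ask every OSD for its exact version string
```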
Re: [ceph-users] MON daemons fail after creating bluestore osd with block.db partition (luminous 12.1.0-1~bpo90+1 )
Hello,

Thomas Gebhardt wrote on 07.07.2017 at 17:21:
> ( e.g.,
> ceph-deploy osd create --bluestore --block-db=/dev/nvme0bnp1 node1:/dev/sdi
> )

just noticed that there was a typo in the block-db device name (/dev/nvme0bnp1 -> /dev/nvme0n1p1). After fixing that misspelling my cookbook worked fine and the mons are running.

Kind regards, Thomas
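[Editor's note: not part of the original post. A pre-flight check like the following would have caught the misspelled device name up front; this is an illustrative sketch, with ceph-deploy only invoked once the device check passes.]

```shell
#!/bin/sh
# Sketch: refuse to run ceph-deploy against a mistyped block.db device
# (the thread's /dev/nvme0bnp1 vs the intended /dev/nvme0n1p1).
require_blockdev() {
    [ -b "$1" ] || { echo "error: $1 is not a block device" >&2; return 1; }
}

DB_DEV=/dev/nvme0n1p1
if require_blockdev "$DB_DEV"; then
    ceph-deploy osd create --bluestore --block-db="$DB_DEV" node1:/dev/sdi
else
    echo "fix the device name before deploying"
fi
```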
[ceph-users] MON daemons fail after creating bluestore osd with block.db partition (luminous 12.1.0-1~bpo90+1 )
Hello,

just testing the latest luminous rc packages on debian stretch with bluestore OSDs. OSDs without a separate block.db partition do fine. But when I try to create an OSD with a separate block.db partition, e.g.,

ceph-deploy osd create --bluestore --block-db=/dev/nvme0bnp1 node1:/dev/sdi

then all MON daemons fail and the cluster stops running (cf. appended journalctl logs). Do you have any idea how to narrow down the problem? (objdump?)

(Please note that I faked /etc/debian_version to jessie, since ceph-deploy 1.5.38 from https://download.ceph.com/debian-luminous/dists/stretch/ does not yet support stretch - but I suppose that's not related to my problem.)

Kind regards, Thomas

Jul 07 09:58:54 node1 systemd[1]: Started Ceph cluster monitor daemon.
Jul 07 09:58:54 node1 ceph-mon[550]: starting mon.node1 rank 0 at x.x.x.x:6789/0 mon_data /var/lib/ceph/mon/ceph-node1 fsid 1e50b861-c10f-4356-9af6-3a90441ee694
Jul 07 16:38:44 node1 ceph-mon[550]: /build/ceph-12.1.0/src/mon/OSDMonitor.cc: In function 'void OSDMonitor::check_pg_creates_subs()' thread 7f678fe61700 time 2017-07-07 16:38:44.576052
Jul 07 16:38:44 node1 ceph-mon[550]: /build/ceph-12.1.0/src/mon/OSDMonitor.cc: 2977: FAILED assert(osdmap.get_up_osd_features() & CEPH_FEATURE_MON_STATEFUL_SUB)
Jul 07 16:38:44 node1 ceph-mon[550]: ceph version 12.1.0 (262617c9f16c55e863693258061c5b25dea5b086) luminous (dev)
Jul 07 16:38:44 node1 ceph-mon[550]: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x55fc324c8802]
Jul 07 16:38:44 node1 ceph-mon[550]: 2: (()+0x474ed0) [0x55fc323c8ed0]
Jul 07 16:38:44 node1 ceph-mon[550]: 3: (OSDMonitor::update_from_paxos(bool*)+0x1a4d) [0x55fc323f168d]
Jul 07 16:38:44 node1 ceph-mon[550]: 4: (PaxosService::refresh(bool*)+0x3ff) [0x55fc323b673f]
Jul 07 16:38:44 node1 ceph-mon[550]: 5: (Monitor::refresh_from_paxos(bool*)+0x1a3) [0x55fc322767e3]
Jul 07 16:38:44 node1 ceph-mon[550]: 6: (Paxos::do_refresh()+0x47) [0x55fc323a0227]
Jul 07 16:38:44 node1 ceph-mon[550]: 7: (Paxos::commit_finish()+0x703) [0x55fc323b1ad3]
Jul 07 16:38:44 node1 ceph-mon[550]: 8: (C_Committed::finish(int)+0x2b) [0x55fc323b55bb]
Jul 07 16:38:44 node1 ceph-mon[550]: 9: (Context::complete(int)+0x9) [0x55fc322b2c19]
Jul 07 16:38:44 node1 ceph-mon[550]: 10: (MonitorDBStore::C_DoTransaction::finish(int)+0xa0) [0x55fc323b33c0]
Jul 07 16:38:44 node1 ceph-mon[550]: 11: (Context::complete(int)+0x9) [0x55fc322b2c19]
Jul 07 16:38:44 node1 ceph-mon[550]: 12: (Finisher::finisher_thread_entry()+0x4c0) [0x55fc324c7950]
Jul 07 16:38:44 node1 ceph-mon[550]: 13: (()+0x7494) [0x7f679dd95494]
Jul 07 16:38:44 node1 ceph-mon[550]: 14: (clone()+0x3f) [0x7f679b27baff]
Jul 07 16:38:44 node1 ceph-mon[550]: NOTE: a copy of the executable, or `objdump -rdS ` is needed to interpret this.
Jul 07 16:38:44 node1 ceph-mon[550]: 2017-07-07 16:38:44.581790 7f678fe61700 -1 /build/ceph-12.1.0/src/mon/OSDMonitor.cc: In function 'void OSDMonitor::check_pg_creates_subs()' thread 7f678f
Jul 07 16:38:44 node1 ceph-mon[550]: /build/ceph-12.1.0/src/mon/OSDMonitor.cc: 2977: FAILED assert(osdmap.get_up_osd_features() & CEPH_FEATURE_MON_STATEFUL_SUB)
Jul 07 16:38:44 node1 ceph-mon[550]: ceph version 12.1.0 (262617c9f16c55e863693258061c5b25dea5b086) luminous (dev)
Jul 07 16:38:44 node1 ceph-mon[550]: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x55fc324c8802]
Jul 07 16:38:44 node1 ceph-mon[550]: 2: (()+0x474ed0) [0x55fc323c8ed0]
Jul 07 16:38:44 node1 ceph-mon[550]: 3: (OSDMonitor::update_from_paxos(bool*)+0x1a4d) [0x55fc323f168d]
Jul 07 16:38:44 node1 ceph-mon[550]: 4: (PaxosService::refresh(bool*)+0x3ff) [0x55fc323b673f]
Jul 07 16:38:44 node1 ceph-mon[550]: 5: (Monitor::refresh_from_paxos(bool*)+0x1a3) [0x55fc322767e3]
Jul 07 16:38:44 node1 ceph-mon[550]: 6: (Paxos::do_refresh()+0x47) [0x55fc323a0227]
Jul 07 16:38:44 node1 ceph-mon[550]: 7: (Paxos::commit_finish()+0x703) [0x55fc323b1ad3]
Jul 07 16:38:44 node1 ceph-mon[550]: 8: (C_Committed::finish(int)+0x2b) [0x55fc323b55bb]
Jul 07 16:38:44 node1 ceph-mon[550]: 9: (Context::complete(int)+0x9) [0x55fc322b2c19]
Jul 07 16:38:44 node1 ceph-mon[550]: 10: (MonitorDBStore::C_DoTransaction::finish(int)+0xa0) [0x55fc323b33c0]
Jul 07 16:38:44 node1 ceph-mon[550]: 11: (Context::complete(int)+0x9) [0x55fc322b2c19]
Jul 07 16:38:44 node1 ceph-mon[550]: 12: (Finisher::finisher_thread_entry()+0x4c0) [0x55fc324c7950]
Jul 07 16:38:44 node1 ceph-mon[550]: 13: (()+0x7494) [0x7f679dd95494]
Jul 07 16:38:44 node1 ceph-mon[550]: 14: (clone()+0x3f) [0x7f679b27baff]
Jul 07 16:38:44 node1 ceph-mon[550]: NOTE: a copy of the executable, or `objdump -rdS ` is needed to interpret this.
Jul 07 16:38:44 node1 ceph-mon[550]: 0> 2017-07-07 16:38:44.581790 7f678fe61700 -1 /build/ceph-12.1.0/src/mon/OSDMonitor.cc: In function 'void OSDMonitor::check_pg_creates_subs()' threa
Jul 07 16:38:44 node1 ceph-mon[550]:
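[Editor's note: not part of the original post. Regarding the log's hint that `objdump -rdS` is needed to interpret the trace, a sketch of how one might resolve the bracketed addresses; the binary path is an assumption, and line numbers require the matching debug symbols package.]

```shell
#!/bin/sh
# Sketch: disassemble the crashing binary to interpret a backtrace.
# BIN is assumed; exits quietly where ceph-mon is not installed.
BIN=/usr/bin/ceph-mon
if [ ! -x "$BIN" ] || ! command -v objdump >/dev/null 2>&1; then
    echo "ceph-mon binary or objdump not available here"
    exit 0
fi
objdump -rdS "$BIN" > /tmp/ceph-mon.asm   # disassembly interleaved with source
# The logged addresses (e.g. 0x55fc323f168d) are ASLR-shifted: subtract the
# mapping base from /proc/<pid>/maps to get a file offset, then e.g.:
# addr2line -f -e "$BIN" 0x3f168d
```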