[ceph-users] Unable to join additional mon servers (luminous)

2018-01-11 Thread Thomas Gebhardt
Hello,

I'm running a ceph-12.2.2 cluster on debian/stretch with three mon
servers and am unsuccessfully trying to add one (or two) additional mon
servers. While the new mon server stays in the "synchronizing" state,
the old mon servers fall out of quorum, endlessly changing state from
"peon" to "electing" or "probing", and eventually back to "peon" or
"leader".

On a small test cluster everything works as expected and the new mons
join the cluster painlessly. But on my production cluster I always run
into trouble, both with ceph-deploy and with manual intervention. I'm
probably missing some fundamental factor. Can anyone give me a hint?
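
For reference, the manual path would be the usual add-a-monitor
sequence; a rough sketch for my-ceph-mon-1, assuming the default
cluster name "ceph":

  # on a node with an admin keyring
  ceph auth get mon. -o /tmp/mon.keyring
  ceph mon getmap -o /tmp/monmap
  # on the new mon host
  mkdir -p /var/lib/ceph/mon/ceph-my-ceph-mon-1
  ceph-mon -i my-ceph-mon-1 --mkfs --monmap /tmp/monmap --keyring /tmp/mon.keyring
  chown -R ceph:ceph /var/lib/ceph/mon/ceph-my-ceph-mon-1
  systemctl start ceph-mon@my-ceph-mon-1

(With ceph-deploy the equivalent would simply be
"ceph-deploy mon add my-ceph-mon-1".)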

These are the existing mons:

my-ceph-mon-3: IP AAA.BBB.CCC.23
my-ceph-mon-4: IP AAA.BBB.CCC.24
my-ceph-mon-5: IP AAA.BBB.CCC.25

Trying to add:

my-ceph-mon-1: IP AAA.BBB.CCC.31

Here is a (hopefully) relevant and representative part of the logs on
my-ceph-mon-5 when my-ceph-mon-1 tries to join:

2018-01-11 15:16:08.340741 7f69ba8db700  0
mon.my-ceph-mon-5@2(peon).data_health(6128) update_stats avail 57% total
19548 MB, used 8411 MB, avail 11149 MB
2018-01-11 15:16:16.830566 7f69b48cf700  0 -- AAA.BBB.CCC.18:6789/0 >>
AAA.BBB.CCC.31:6789/0 conn(0x55d19cac2000 :6789
s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0
l=0).handle_connect_msg accept connect_seq 0 vs existing csq=1
existing_state=STATE_STANDBY
2018-01-11 15:16:16.830582 7f69b48cf700  0 -- AAA.BBB.CCC.18:6789/0 >>
AAA.BBB.CCC.31:6789/0 conn(0x55d19cac2000 :6789
s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0
l=0).handle_connect_msg accept peer reset, then tried to connect to us,
replacing
2018-01-11 15:16:16.831864 7f69b80d6700  1 mon.my-ceph-mon-5@2(peon) e15
 adding peer AAA.BBB.CCC.31:6789/0 to list of hints
2018-01-11 15:16:16.833701 7f69b50d0700  0 -- AAA.BBB.CCC.18:6789/0 >>
AAA.BBB.CCC.31:6789/0 conn(0x55d19c8ca000 :6789
s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0
l=0).handle_connect_msg accept connect_seq 0 vs existing csq=1
existing_state=STATE_STANDBY
2018-01-11 15:16:16.833713 7f69b50d0700  0 -- AAA.BBB.CCC.18:6789/0 >>
AAA.BBB.CCC.31:6789/0 conn(0x55d19c8ca000 :6789
s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0
l=0).handle_connect_msg accept peer reset, then tried to connect to us,
replacing
2018-01-11 15:16:16.834843 7f69b80d6700  1 mon.my-ceph-mon-5@2(peon) e15
 adding peer AAA.BBB.CCC.31:6789/0 to list of hints
2018-01-11 15:16:35.907962 7f69ba8db700  1
mon.my-ceph-mon-5@2(peon).paxos(paxos active c 9653210..9653763)
lease_timeout -- calling new election
2018-01-11 15:16:35.908589 7f69b80d6700  0 mon.my-ceph-mon-5@2(probing)
e15 handle_command mon_command({"prefix": "status"} v 0) v1
2018-01-11 15:16:35.908630 7f69b80d6700  0 log_channel(audit) log [DBG]
: from='client.? 172.25.24.15:0/1078983440' entity='client.admin'
cmd=[{"prefix": "status"}]: dispatch
2018-01-11 15:16:35.909124 7f69b80d6700  0 log_channel(cluster) log
[INF] : mon.my-ceph-mon-5 calling new monitor election
2018-01-11 15:16:35.909284 7f69b80d6700  1
mon.my-ceph-mon-5@2(electing).elector(6128) init, last seen epoch 6128
2018-01-11 15:16:50.132414 7f69ba8db700  1
mon.my-ceph-mon-5@2(electing).elector(6129) init, last seen epoch 6129,
mid-election, bumping
2018-01-11 15:16:55.209177 7f69b80d6700 -1
mon.my-ceph-mon-5@2(peon).paxos(paxos recovering c 9653210..9653777)
lease_expire from mon.0 AAA.BBB.CCC.23:6789/0 is 0.032801 seconds in the
past; mons are probably laggy (or possibly clocks are too skewed)
2018-01-11 15:17:09.316472 7f69ba8db700  1
mon.my-ceph-mon-5@2(peon).paxos(paxos updating c 9653210..9653778)
lease_timeout -- calling new election
2018-01-11 15:17:09.316597 7f69ba8db700  0
mon.my-ceph-mon-5@2(probing).data_health(6134) update_stats avail 57%
total 19548 MB, used 8411 MB, avail 11149 MB
2018-01-11 15:17:09.317414 7f69b80d6700  0 log_channel(cluster) log
[INF] : mon.my-ceph-mon-5 calling new monitor election
2018-01-11 15:17:09.317517 7f69b80d6700  1
mon.my-ceph-mon-5@2(electing).elector(6134) init, last seen epoch 6134
2018-01-11 15:17:22.059573 7f69ba8db700  1
mon.my-ceph-mon-5@2(peon).paxos(paxos updating c 9653210..9653779)
lease_timeout -- calling new election
2018-01-11 15:17:22.060021 7f69b80d6700  1
mon.my-ceph-mon-5@2(probing).data_health(6138) service_dispatch_op not
in quorum -- drop message
2018-01-11 15:17:22.060279 7f69b80d6700  1
mon.my-ceph-mon-5@2(probing).data_health(6138) service_dispatch_op not
in quorum -- drop message
2018-01-11 15:17:22.060499 7f69b80d6700  0 log_channel(cluster) log
[INF] : mon.my-ceph-mon-5 calling new monitor election
2018-01-11 15:17:22.060612 7f69b80d6700  1
mon.my-ceph-mon-5@2(electing).elector(6138) init, last seen epoch 6138
...

As far as I can see, clock skew is not a problem (checked with "ntpq -p").
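
Beyond ntpq, the monitors' own view of the clocks can be queried; a
short sketch (commands as available in luminous):

  ceph time-sync-status                      # skew as measured within the mon quorum
  ceph daemon mon.my-ceph-mon-5 mon_status   # that mon's view of quorum, via the admin socket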

Any idea what might go wrong?

Thanks, Thomas


Re: [ceph-users] osd heartbeat protocol issue on upgrade v12.1.0 ->v12.2.0

2017-09-01 Thread Thomas Gebhardt
Hello,

thank you very much for the hint, you are right!

Kind regards, Thomas

Marc Roos wrote on 30.08.2017 at 14:26:
>
> I had this once, too. If you update all nodes and then systemctl restart
> 'ceph-osd@*' on all nodes, you should be fine. But restart the monitors
> first, of course.
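
In concrete terms that amounts to something like the following on each
node (a sketch; monitors first, then the OSDs once all nodes run the
same packages):

  systemctl restart ceph-mon.target     # monitors first
  systemctl restart 'ceph-osd@*'        # then the OSDs on that node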


[ceph-users] osd heartbeat protocol issue on upgrade v12.1.0 ->v12.2.0

2017-08-30 Thread Thomas Gebhardt
Hello,

when I upgraded a single osd node (only one so far) from v12.1.0 to
v12.2.0, its osds started flapping and finally all got marked as down.

As far as I can see, this is due to an incompatibility in the osd
heartbeat protocol between the two versions:

v12.2.0 node:
7f4f7b6e6700 -1 osd.X 3879 heartbeat_check: no reply from x.x.x.x:
osd.Y ever on either front or back, first ping sent ...

v12.1.0 node:
7fd854ebf700 -1 failed to decode message of type 70 v4:
buffer::malformed_input: void
osd_peer_stat_t::decode(ceph::buffer::list::iterator&) no longer
understand old encoding version 1 < struct_compat

(It is puzzling that the *older* v12.1.0 node complains about the *old*
encoding version sent by the *newer* v12.2.0 node.)
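
A quick way to see which daemons are still running the old version is
to compare the reported versions (a sketch; "ceph versions" exists in
luminous):

  ceph versions              # per-daemon-type version summary
  ceph tell osd.* version    # per-OSD, if more detail is needed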

Any idea how I can proceed?

Kind regards, Thomas


Re: [ceph-users] MON daemons fail after creating bluestore osd with block.db partition (luminous 12.1.0-1~bpo90+1 )

2017-07-10 Thread Thomas Gebhardt
Hello,

Thomas Gebhardt schrieb am 07.07.2017 um 17:21:
> ( e.g.,
> ceph-deploy osd create --bluestore --block-db=/dev/nvme0bnp1 node1:/dev/sdi
> )

just noticed that there was a typo in the block-db device name
(/dev/nvme0bnp1 -> /dev/nvme0n1p1). After fixing that misspelling, my
cookbook worked fine and the mons are running.
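
For the record, the working invocation is

  ceph-deploy osd create --bluestore --block-db=/dev/nvme0n1p1 node1:/dev/sdi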

Kind regards, Thomas


[ceph-users] MON daemons fail after creating bluestore osd with block.db partition (luminous 12.1.0-1~bpo90+1 )

2017-07-07 Thread Thomas Gebhardt
Hello,

just testing the latest luminous rc packages on debian stretch with
bluestore OSDs.

OSDs without a separate block.db partition work fine.
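
(For comparison, a plain bluestore invocation of the form

  ceph-deploy osd create --bluestore node1:/dev/sdh

comes up without problems; the device name here is just for
illustration.)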

But when I try to create an OSD with a separate block.db partition,

( e.g.,
ceph-deploy osd create --bluestore --block-db=/dev/nvme0bnp1 node1:/dev/sdi
)

then all MON daemons fail and the cluster stops running
(cf. the appended journalctl logs).

Do you have any idea how to narrow down the problem? (objdump?)
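
(The NOTE at the end of the trace presumably means disassembling the
monitor binary itself, i.e. something like

  objdump -rdS /usr/bin/ceph-mon > /tmp/ceph-mon.dis

assuming the default install path.)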

(please note that I faked /etc/debian_version to jessie, since
ceph-deploy 1.5.38 from
https://download.ceph.com/debian-luminous/dists/stretch/
does not yet support stretch - but I suppose that's not related to my
problem).

Kind regards, Thomas
Jul 07 09:58:54 node1 systemd[1]: Started Ceph cluster monitor daemon.
Jul 07 09:58:54 node1 ceph-mon[550]: starting mon.node1 rank 0 at 
x.x.x.x:6789/0 mon_data /var/lib/ceph/mon/ceph-node1 fsid 
1e50b861-c10f-4356-9af6-3a90441ee694
Jul 07 16:38:44 node1 ceph-mon[550]: /build/ceph-12.1.0/src/mon/OSDMonitor.cc: 
In function 'void OSDMonitor::check_pg_creates_subs()' thread 7f678fe61700 time 
2017-07-07 16:38:44.576052
Jul 07 16:38:44 node1 ceph-mon[550]: /build/ceph-12.1.0/src/mon/OSDMonitor.cc: 
2977: FAILED assert(osdmap.get_up_osd_features() & 
CEPH_FEATURE_MON_STATEFUL_SUB)
Jul 07 16:38:44 node1 ceph-mon[550]:  ceph version 12.1.0 
(262617c9f16c55e863693258061c5b25dea5b086) luminous (dev)
Jul 07 16:38:44 node1 ceph-mon[550]:  1: (ceph::__ceph_assert_fail(char const*, 
char const*, int, char const*)+0x102) [0x55fc324c8802]
Jul 07 16:38:44 node1 ceph-mon[550]:  2: (()+0x474ed0) [0x55fc323c8ed0]
Jul 07 16:38:44 node1 ceph-mon[550]:  3: 
(OSDMonitor::update_from_paxos(bool*)+0x1a4d) [0x55fc323f168d]
Jul 07 16:38:44 node1 ceph-mon[550]:  4: (PaxosService::refresh(bool*)+0x3ff) 
[0x55fc323b673f]
Jul 07 16:38:44 node1 ceph-mon[550]:  5: 
(Monitor::refresh_from_paxos(bool*)+0x1a3) [0x55fc322767e3]
Jul 07 16:38:44 node1 ceph-mon[550]:  6: (Paxos::do_refresh()+0x47) 
[0x55fc323a0227]
Jul 07 16:38:44 node1 ceph-mon[550]:  7: (Paxos::commit_finish()+0x703) 
[0x55fc323b1ad3]
Jul 07 16:38:44 node1 ceph-mon[550]:  8: (C_Committed::finish(int)+0x2b) 
[0x55fc323b55bb]
Jul 07 16:38:44 node1 ceph-mon[550]:  9: (Context::complete(int)+0x9) 
[0x55fc322b2c19]
Jul 07 16:38:44 node1 ceph-mon[550]:  10: 
(MonitorDBStore::C_DoTransaction::finish(int)+0xa0) [0x55fc323b33c0]
Jul 07 16:38:44 node1 ceph-mon[550]:  11: (Context::complete(int)+0x9) 
[0x55fc322b2c19]
Jul 07 16:38:44 node1 ceph-mon[550]:  12: 
(Finisher::finisher_thread_entry()+0x4c0) [0x55fc324c7950]
Jul 07 16:38:44 node1 ceph-mon[550]:  13: (()+0x7494) [0x7f679dd95494]
Jul 07 16:38:44 node1 ceph-mon[550]:  14: (clone()+0x3f) [0x7f679b27baff]
Jul 07 16:38:44 node1 ceph-mon[550]:  NOTE: a copy of the executable, or 
`objdump -rdS <executable>` is needed to interpret this.
Jul 07 16:38:44 node1 ceph-mon[550]: 2017-07-07 16:38:44.581790 7f678fe61700 -1 
/build/ceph-12.1.0/src/mon/OSDMonitor.cc: In function 'void 
OSDMonitor::check_pg_creates_subs()' thread 7f678f
Jul 07 16:38:44 node1 ceph-mon[550]: /build/ceph-12.1.0/src/mon/OSDMonitor.cc: 
2977: FAILED assert(osdmap.get_up_osd_features() & 
CEPH_FEATURE_MON_STATEFUL_SUB)
Jul 07 16:38:44 node1 ceph-mon[550]:  ceph version 12.1.0 
(262617c9f16c55e863693258061c5b25dea5b086) luminous (dev)
Jul 07 16:38:44 node1 ceph-mon[550]:  1: (ceph::__ceph_assert_fail(char const*, 
char const*, int, char const*)+0x102) [0x55fc324c8802]
Jul 07 16:38:44 node1 ceph-mon[550]:  2: (()+0x474ed0) [0x55fc323c8ed0]
Jul 07 16:38:44 node1 ceph-mon[550]:  3: 
(OSDMonitor::update_from_paxos(bool*)+0x1a4d) [0x55fc323f168d]
Jul 07 16:38:44 node1 ceph-mon[550]:  4: (PaxosService::refresh(bool*)+0x3ff) 
[0x55fc323b673f]
Jul 07 16:38:44 node1 ceph-mon[550]:  5: 
(Monitor::refresh_from_paxos(bool*)+0x1a3) [0x55fc322767e3]
Jul 07 16:38:44 node1 ceph-mon[550]:  6: (Paxos::do_refresh()+0x47) 
[0x55fc323a0227]
Jul 07 16:38:44 node1 ceph-mon[550]:  7: (Paxos::commit_finish()+0x703) 
[0x55fc323b1ad3]
Jul 07 16:38:44 node1 ceph-mon[550]:  8: (C_Committed::finish(int)+0x2b) 
[0x55fc323b55bb]
Jul 07 16:38:44 node1 ceph-mon[550]:  9: (Context::complete(int)+0x9) 
[0x55fc322b2c19]
Jul 07 16:38:44 node1 ceph-mon[550]:  10: 
(MonitorDBStore::C_DoTransaction::finish(int)+0xa0) [0x55fc323b33c0]
Jul 07 16:38:44 node1 ceph-mon[550]:  11: (Context::complete(int)+0x9) 
[0x55fc322b2c19]
Jul 07 16:38:44 node1 ceph-mon[550]:  12: 
(Finisher::finisher_thread_entry()+0x4c0) [0x55fc324c7950]
Jul 07 16:38:44 node1 ceph-mon[550]:  13: (()+0x7494) [0x7f679dd95494]
Jul 07 16:38:44 node1 ceph-mon[550]:  14: (clone()+0x3f) [0x7f679b27baff]
Jul 07 16:38:44 node1 ceph-mon[550]:  NOTE: a copy of the executable, or 
`objdump -rdS <executable>` is needed to interpret this.
Jul 07 16:38:44 node1 ceph-mon[550]:  0> 2017-07-07 16:38:44.581790 
7f678fe61700 -1 /build/ceph-12.1.0/src/mon/OSDMonitor.cc: In function 'void 
OSDMonitor::check_pg_creates_subs()' threa
Jul 07 16:38:44 node1 ceph-mon[550]: