At cluster creation I'm seeing that the mons are taking a while to form 
quorum. It seems like I'm hitting a timeout of 60s somewhere. Am I missing a 
config setting that would help paxos establish quorum sooner? When initializing 
with the monmap I would have expected the mons to initialize very quickly.

The scenario is:

  *   Luminous RC 2
  *   The mons are initialized with a monmap
  *   Running in Kubernetes (Rook)

The symptoms are:

  *   When all three mons start in parallel, they appear to determine their 
rank immediately. I assume this means they establish communication. A log 
message such as this is seen in each of the mon logs:
     *   2017-08-08 17:03:16.383599 7f8da7c85f40  0 mon.rook-ceph-mon1@-1(probing) e0  my rank is now 0 (was -1)
  *   Now paxos enters a loop that times out every two seconds and lasts about 
60s, trying to probe the other monitors. During this wait, I am able to curl 
the mon endpoints successfully.
     *   2017-08-08 17:03:17.345877 7f02b779af40 10 mon.rook-ceph-mon0@1(probing) e0 probing other monitors
     *   2017-08-08 17:03:19.346032 7f02ae568700  4 mon.rook-ceph-mon0@1(probing) e0 probe_timeout 0x55c93678bb00
  *   After about 60 seconds the probe succeeds and the mons start responding
     *   2017-08-08 17:04:17.356928 7f02ae568700 10 mon.rook-ceph-mon0@1(probing) e0 probing other monitors
     *   2017-08-08 17:04:17.366587 7f02a855c700 10 mon.rook-ceph-mon0@1(probing) e0 ms_verify_authorizer mon protocol 2
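
The length of the probing window can be read off the log timestamps. Below is a minimal Python sketch of that calculation, assuming the timestamp layout shown in the excerpts above. The helper names `parse_ts` and `probing_window` are hypothetical and only for illustration; the log lines are the ones quoted above, joined onto single lines.

```python
from datetime import datetime

# Log excerpts from mon.rook-ceph-mon0, as quoted above.
LOG_LINES = [
    "2017-08-08 17:03:17.345877 7f02b779af40 10 mon.rook-ceph-mon0@1(probing) e0 probing other monitors",
    "2017-08-08 17:03:19.346032 7f02ae568700  4 mon.rook-ceph-mon0@1(probing) e0 probe_timeout 0x55c93678bb00",
    "2017-08-08 17:04:17.356928 7f02ae568700 10 mon.rook-ceph-mon0@1(probing) e0 probing other monitors",
    "2017-08-08 17:04:17.366587 7f02a855c700 10 mon.rook-ceph-mon0@1(probing) e0 ms_verify_authorizer mon protocol 2",
]


def parse_ts(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS.ffffff' timestamp of a log line."""
    date, time = line.split()[:2]
    return datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S.%f")


def probing_window(lines):
    """Seconds between the first probe attempt and the first authorizer check."""
    start = next(parse_ts(l) for l in lines if "probing other monitors" in l)
    end = next(parse_ts(l) for l in lines if "ms_verify_authorizer" in l)
    return (end - start).total_seconds()


print(probing_window(LOG_LINES))
```

For the four lines above this gives a window of just over 60 seconds, which matches the delay described.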

The relevant settings in the config are:
mon initial members  = rook-ceph-mon0 rook-ceph-mon1 rook-ceph-mon2
mon host                      =,,
public addr                   =
cluster addr                  =

The full log for this mon at debug log level 20 can be found here:

Any ideas?
