-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68706/
-----------------------------------------------------------

(Updated Oct. 19, 2018, 11:14 p.m.)


Review request for mesos, Benjamin Mahler, James Peach, and Jiang Yan Xu.


Bugs: MESOS-9178
    https://issues.apache.org/jira/browse/MESOS-9178


Repository: mesos


Description
-------

During a master failover, the time at which the new master is elected
is treated as the start of the failover. As agents reregister, each
percentile metric records the time, in seconds since election, at which
that percentage of the recovered agents had finished reregistering.
Taken together, these metrics show overall reregistration progress,
and the high percentiles expose any slowdown toward the end of
reregistration. If some agents are unreachable after the failover and
a given percentile is never reached, that metric keeps its initial
value and is not updated.
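
For reviewers who want to reason about the numbers, here is a minimal
sketch of the bookkeeping described above. It is not the code in
src/master/master.cpp: the class name ReregistrationTracker, its
methods, and the PERCENTILES array are hypothetical, and the rounding
rule (floor of total * percent / 100) is an assumption inferred from
the sample output in the Testing section.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical helper for illustration; not the code in src/master/master.cpp.
// The percentiles mirror the metric names shown in the Testing section.
static const int PERCENTILES[] = {25, 50, 75, 90, 99, 100};

class ReregistrationTracker
{
public:
  // `recoveredAgents` is the number of agents the new master expects back.
  explicit ReregistrationTracker(std::size_t recoveredAgents)
    : total(recoveredAgents), reregistered(0) {}

  // Call once per agent, with the seconds elapsed since the master's election.
  void agentReregistered(double elapsedSecs)
  {
    assert(reregistered < total);
    ++reregistered;

    for (int percent : PERCENTILES) {
      // Assumed rounding: floor(total * percent / 100).
      std::size_t needed = total * static_cast<std::size_t>(percent) / 100;

      // Record each percentile only once, the first time it is reached.
      if (secs.count(percent) == 0 && reregistered >= needed) {
        secs[percent] = elapsedSecs;
      }
    }
  }

  // Returns 0.0 (the initial value) for a percentile that was never reached.
  double get(int percent) const
  {
    std::map<int, double>::const_iterator it = secs.find(percent);
    return it == secs.end() ? 0.0 : it->second;
  }

private:
  std::size_t total;
  std::size_t reregistered;
  std::map<int, double> secs;
};
```

Under these assumptions, feeding a tracker for 6 agents the elapsed
times 1, 4, 8, 16, 31 and 39 seconds gives the same values as the
completed case in the Testing section (the 4-second entry for the
second agent is made up, since no metric exposes it). If the last two
agents never come back, the 90, 99 and 100 percent values stay at 0.0.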


Diffs (updated)
-----

  src/master/master.cpp 868787bb2f9d879531402f83507b322462322efc 
  src/master/metrics.hpp e1da18e6ba2737f729e1e30653020538150ae898 


Diff: https://reviews.apache.org/r/68706/diff/6/

Changes: https://reviews.apache.org/r/68706/diff/5-6/


Testing (updated)
-------

Automation:
[ RUN      ] MasterTest.MetricsInMetricsEndpoint
[       OK ] MasterTest.MetricsInMetricsEndpoint (42 ms)

Real world cases:

While the master is not elected or there are no agents to recover:
"master/recovered_agents_100_percent_reregistered_secs": 0.0,
"master/recovered_agents_25_percent_reregistered_secs": 0.0,
"master/recovered_agents_50_percent_reregistered_secs": 0.0,
"master/recovered_agents_75_percent_reregistered_secs": 0.0,
"master/recovered_agents_90_percent_reregistered_secs": 0.0,
"master/recovered_agents_99_percent_reregistered_secs": 0.0,


While reregistration is in progress (4 out of 6 completed):
"master/recovered_agents_100_percent_reregistered_secs": 0.0,
"master/recovered_agents_25_percent_reregistered_secs": 1.0,
"master/recovered_agents_50_percent_reregistered_secs": 8.0,
"master/recovered_agents_75_percent_reregistered_secs": 16.0,
"master/recovered_agents_90_percent_reregistered_secs": 0.0,
"master/recovered_agents_99_percent_reregistered_secs": 0.0,
"master/slave_reregistrations": 4.0,


After all 6 reregistrations completed:
"master/recovered_agents_100_percent_reregistered_secs": 39.0,
"master/recovered_agents_25_percent_reregistered_secs": 1.0,
"master/recovered_agents_50_percent_reregistered_secs": 8.0,
"master/recovered_agents_75_percent_reregistered_secs": 16.0,
"master/recovered_agents_90_percent_reregistered_secs": 31.0,
"master/recovered_agents_99_percent_reregistered_secs": 31.0,
"master/slave_reregistrations": 6.0,


Thanks,

Xudong Ni
