[
https://issues.apache.org/jira/browse/MESOS-7711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yan Xu updated MESOS-7711:
--------------------------
Shepherd: James Peach
> Master updates registry for reregistering agents even when they haven't been
> unreachable
> ----------------------------------------------------------------------------------------
>
> Key: MESOS-7711
> URL: https://issues.apache.org/jira/browse/MESOS-7711
> Project: Mesos
> Issue Type: Bug
> Components: master
> Reporter: Yan Xu
> Assignee: Yan Xu
>
> During a master failover we observed many registry updates, on average _one
> per two agents_, as indicated by the log line
> {noformat:title=}
> I0609 04:46:25.220196 48864 registrar.cpp:550] Successfully updated the
> registry in 42.904064ms
> {noformat}
> [code|https://github.com/apache/mesos/blob/19a6134d03141dc2cb073a904378c2c129b5138d/src/master/registrar.cpp#L550]
> In this case few agents were ever unreachable, so most of these registry
> updates are redundant.
> Each registry update also incurs the time spent applying the operations:
> {noformat:title=}
> I0609 04:46:26.475761 48897 registrar.cpp:493] Applied 1 operations in
> 11.673082ms; attempting to update the registry
> {noformat}
> [code|https://github.com/apache/mesos/blob/19a6134d03141dc2cb073a904378c2c129b5138d/src/master/registrar.cpp#L493]
> Even though these operations do not consume the time of the Master actor,
> every agent reregistration is guarded and delayed by them. This could easily
> be avoided by checking the {{slaves.recovered}} field in {{Master}}.
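
The check proposed in the description can be illustrated with a toy model. This is only a sketch under stated assumptions, not Mesos code: `ToyMaster`, its members, and `reregister` are hypothetical names, and the `recovered` set only loosely mirrors the {{slaves.recovered}} field mentioned above. The idea shown is that an agent which is still listed as recovered from the registry was never marked unreachable, so its reregistration should not need a registry write.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical stand-in for the master's agent bookkeeping (not Mesos code).
struct ToyMaster {
  // Agents read back from the registry after a failover that have not yet
  // reregistered. Assumption: loosely modeled on `slaves.recovered`.
  std::unordered_set<std::string> recovered;

  // Agents that have completed reregistration with this master.
  std::unordered_set<std::string> registered;

  // Number of registry writes this toy master has issued.
  int registryUpdates = 0;

  // Handles a reregistration request. Returns true only if a registry
  // update had to be issued for this agent.
  bool reregister(const std::string& agentId) {
    if (registered.count(agentId) > 0) {
      // Already reregistered: nothing to do.
      return false;
    }

    if (recovered.erase(agentId) > 0) {
      // The agent is still admitted in the registry, so it was never
      // marked unreachable. Skip the registry write entirely.
      registered.insert(agentId);
      return false;
    }

    // The agent is not among the recovered ones, so in this model it must
    // be marked reachable again, which costs one registry update.
    ++registryUpdates;
    registered.insert(agentId);
    return true;
  }
};
```

In this model, a master that recovered `agent-1` and `agent-2` and then receives reregistrations from `agent-1` and an unknown `agent-3` issues a single registry update (for `agent-3`) instead of two.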