[ https://issues.apache.org/jira/browse/MESOS-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yan Xu reassigned MESOS-8630:
-----------------------------

    Assignee: Xudong Ni

> All subsequent registry operations fail after the registrar is aborted after a failed update
> --------------------------------------------------------------------------------------------
>
>                 Key: MESOS-8630
>                 URL: https://issues.apache.org/jira/browse/MESOS-8630
>             Project: Mesos
>          Issue Type: Bug
>          Components: master
>            Reporter: Yan Xu
>            Assignee: Xudong Ni
>            Priority: Major
>
> A failure to update the registry always aborts the registrar but doesn't
> always abort the master process.
> When the registrar fails to update the registry, it aborts the actor and
> fails all future operations. The rationale is explained here:
> [https://github.com/apache/mesos/commit/5eaf1eb346fc2f46c852c1246bdff12a89216b60]
> {quote}In this event, the Master won't commit suicide until the initial
> failure is processed. However, in the interim, subsequent operations
> are potentially being performed against the Registrar. This could lead
> to fighting between masters if a "demoted" master re-attempts to
> acquire log-leadership!
> {quote}
> However, when a registrar update is requested through an operator API
> (maintenance, quota update, etc.), the master process doesn't shut down (a 500
> error is returned to the client instead) and all subsequent operations will
> fail!

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)