Yan Xu created MESOS-8630:
-----------------------------

             Summary: All subsequent registry operations fail after the 
registrar is aborted after a failed update
                 Key: MESOS-8630
                 URL: https://issues.apache.org/jira/browse/MESOS-8630
             Project: Mesos
          Issue Type: Bug
          Components: master
            Reporter: Yan Xu


A failure to update the registry always aborts the registrar but doesn't always 
abort the master process.

When the registrar fails to update the registry, it aborts the actor and fails 
all future operations. The rationale is explained here: 
[https://github.com/apache/mesos/commit/5eaf1eb346fc2f46c852c1246bdff12a89216b60]
{quote}In this event, the Master won't commit suicide until the initial
 failure is processed. However, in the interim, subsequent operations
 are potentially being performed against the Registrar. This could lead
 to fighting between masters if a "demoted" master re-attempts to
 acquire log-leadership!
{quote}
However, when the registry update is requested through an operator API 
(maintenance, quota update, etc.), the master process doesn't shut down (a 500 
error is returned to the client instead), and all subsequent operations fail!
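
The failure mode can be illustrated with a small toy model. This is only a sketch, not Mesos code: ToyRegistrar, handle_operator_request and flaky_store are hypothetical names invented for illustration, and the real registrar is a C++ actor. It assumes a store that fails on the first write only, and shows that once the registrar has been aborted, later operator requests keep getting a 500 even though the store has recovered.

```python
class ToyRegistrar:
    """Hypothetical stand-in for the registrar actor (not the Mesos API)."""

    def __init__(self, store):
        self.store = store   # callable that persists an operation; may raise
        self.error = None    # set once the registrar has been aborted

    def apply(self, operation):
        if self.error is not None:
            # Aborted: fail every later operation without touching the store.
            raise RuntimeError("registrar aborted: " + self.error)
        try:
            self.store(operation)
        except Exception as e:
            # A failed update aborts the registrar permanently.
            self.error = str(e)
            raise RuntimeError("failed to update registry: " + self.error)


def handle_operator_request(registrar, operation):
    """Toy operator API handler: a registrar failure becomes an HTTP 500,
    and the process itself keeps running."""
    try:
        registrar.apply(operation)
        return 200
    except RuntimeError:
        return 500


def flaky_store():
    """Build a store that fails on its first write only (an assumption
    made for this illustration), plus a counter of store calls."""
    calls = {"n": 0}

    def store(operation):
        calls["n"] += 1
        if calls["n"] == 1:
            raise IOError("write timed out")

    return store, calls


store, calls = flaky_store()
registrar = ToyRegistrar(store)
statuses = [
    handle_operator_request(registrar, op)
    for op in ("update quota", "schedule maintenance", "update quota")
]
print(statuses, calls["n"])
```

In this model the store is reached only once. The second and third requests are rejected by the aborted registrar itself, which mirrors the situation described above: the process stays up, but no registry operation can succeed until it is restarted.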



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
