[ 
https://issues.apache.org/jira/browse/MESOS-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway updated MESOS-5833:
-------------------------------
    Description: 
I'd like to propose that we disable the {{--registry_strict}} master flag for 
Mesos 1.0.0.

Rationale:

* By default, when a partitioned agent tries to re-register, Mesos will kill the 
tasks and shut down the agent ("kill semantics") if the master has NOT failed 
over. If the master has failed over, it will NOT kill the tasks and will allow 
the agent to re-register ("no-kill semantics").
* If {{--registry_strict}} is enabled, Mesos will apply "kill" semantics in both 
cases (master failover or not).
* In the future, we want Mesos to apply "no-kill" semantics in both cases, 
as described further in MESOS-4049.
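
For readers who prefer a table to prose, the three cases above can be written out as a small lookup. This is only an illustrative sketch in Python, not Mesos code: the names {{Mode}} and {{reregistration_outcome}} are hypothetical, and the "proposed" row reflects the intent stated for MESOS-4049 rather than any shipped behavior.

```python
from enum import Enum


class Mode(Enum):
    """Hypothetical labels for the three behaviors discussed above."""
    DEFAULT = "default"    # --registry_strict not enabled
    STRICT = "strict"      # --registry_strict enabled
    PROPOSED = "proposed"  # intended future behavior (MESOS-4049)


def reregistration_outcome(mode: Mode, master_failed_over: bool) -> str:
    """Return "kill" or "no-kill" for a partitioned agent that re-registers."""
    if mode is Mode.STRICT:
        # Strict registry: kill regardless of master failover.
        return "kill"
    if mode is Mode.PROPOSED:
        # Intended future behavior: never kill.
        return "no-kill"
    # Default: the outcome depends on whether the master has failed over.
    return "no-kill" if master_failed_over else "kill"
```

Calling {{reregistration_outcome(Mode.DEFAULT, False)}} gives "kill" and {{reregistration_outcome(Mode.DEFAULT, True)}} gives "no-kill", which is the mixed behavior described in the first bullet; the strict and proposed modes each return the same answer in both cases.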

Hence, allowing Mesos installations to set {{--registry_strict}} moves them 
*away* from the future default behavior ("no-kill/no-kill") -- i.e., if you 
assume "kill/kill" and write your framework accordingly, it will be harder to 
migrate to the new behavior described by MESOS-4049. Since there are basically 
no circumstances in which we would recommend that someone set this flag to 
true, I think we should prevent users from enabling this behavior. (This flag 
is also clearly marked as "experimental" and "not for production use", and I'm 
not aware of any Mesos users that have enabled it.)

The proposed change (RR below) would make the master refuse to start up if the 
{{--registry_strict}} flag is specified.
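
To make the intended effect concrete, here is a rough sketch of a startup check that rejects the flag. It is written in Python purely for illustration and is not the actual patch: {{FlagError}} and {{check_master_flags}} are hypothetical names, and the sketch assumes that any occurrence of the flag on the command line is rejected, whatever its value. The description above does not say how an explicit false value is treated.

```python
from typing import List


class FlagError(Exception):
    """Raised when a disallowed flag is present (hypothetical helper)."""


def check_master_flags(argv: List[str]) -> None:
    """Refuse to proceed if --registry_strict appears in argv.

    Assumption: the flag is rejected whether it is given bare
    (--registry_strict) or with a value (--registry_strict=true).
    """
    for arg in argv:
        # Split "--name=value" into its name part; bare flags pass through.
        name = arg.split("=", 1)[0]
        if name == "--registry_strict":
            raise FlagError(
                "--registry_strict is disabled in this release; "
                "remove the flag to start the master"
            )
```

A caller would run this check before any other initialization and exit with a non-zero status if {{FlagError}} is raised, so a misconfigured master fails fast instead of starting with unexpected semantics.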

All of the code for the strict registry path will be retained, so it will be 
easy to revert this change if we do find Mesos installations that depend on the 
current "kill/kill" semantics enabled by {{--registry_strict}}. However, a 
recent email to the {{user}} and {{dev}} lists did not turn up anyone who 
reported using the strict registry.

  was:
I'd like to propose that we disable the {{--registry_strict}} master flag for 
Mesos 1.0.0.

Rationale:

* This feature has always been marked as experimental and sparsely documented.
* The strict registry will no longer be relevant once support for 
partition-aware frameworks is introduced (MESOS-4049).
* If we can assume that no frameworks are assuming the strict-registry 
behavior, we can simplify the design/implementation of MESOS-4049. It also 
makes the system's current behavior easier to explain, and makes the current 
LOST behavior more similar to the proposed UNREACHABLE behavior in MESOS-4049.

Note that if we disable the {{--registry_strict}} flag but keep the code (for 
now), we can always re-enable the feature if any users complain. However, a 
recent email to the {{user}} and {{dev}} lists did not result in anyone 
volunteering that they are using the strict registry.


> Disable strict registry
> -----------------------
>
>                 Key: MESOS-5833
>                 URL: https://issues.apache.org/jira/browse/MESOS-5833
>             Project: Mesos
>          Issue Type: Improvement
>          Components: master
>            Reporter: Neil Conway
>            Assignee: Neil Conway
>            Priority: Blocker
>              Labels: mesosphere
>             Fix For: 1.0.0



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
