[ https://issues.apache.org/jira/browse/MESOS-6854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15827207#comment-15827207 ]
Benjamin Mahler commented on MESOS-6854:
----------------------------------------

[~guoger] I filed MESOS-6940 to replace this ticket; I think we want to avoid sending offers entirely.

The situation you mention is OK so long as the framework never changes its roles, which we will impose in phase 1. However, in phase 2, we don't have a means to know whether the framework changed its role during the lifetime of the agent. The only check available is to inspect the active executors and tasks, and this only tells us that the framework didn't change its role during the lifetime of the active tasks and executors on the agent. If the framework changed its role before these were launched, we might accidentally expose the agent to a changed framework role, and the old agent doesn't handle role changes.

Since this is rather complicated, and since this is new functionality, I think we can say: "If you want to use a MULTI_ROLE framework, upgrade your cluster to 1.z. If there are agents registered that have not been upgraded, the MULTI_ROLE framework will not receive any offers for these agents. If the MULTI_ROLE framework was previously running without the MULTI_ROLE capability and has long-running tasks on non-MULTI_ROLE agents, these tasks will continue to run."

We'll need to be precise about this kind of thing in the upgrade notes that we publish.

> Prevent launching MULTI_ROLE framework's tasks on agents without MULTI_ROLE support.
> ------------------------------------------------------------------------------------
>
>                 Key: MESOS-6854
>                 URL: https://issues.apache.org/jira/browse/MESOS-6854
>             Project: Mesos
>          Issue Type: Task
>          Components: agent, master
>            Reporter: Benjamin Mahler
>            Assignee: Jay Guo
>
> The proposal for upgrades / backwards compatibility in phase 1 of multi-role
> framework support is that we require that masters and agents are all upgraded
> before a multi-role framework registers.
>
> We need to explicitly protect against this situation occurring given it's
> common for old agents to show up in a cluster. The master can prevent the
> launching of MULTI_ROLE frameworks' tasks on agents without MULTI_ROLE
> framework support.
>
> If we were to naively let this happen, the old agent would think the resources
> are allocated to the "*" role and there would need to be master logic to deal
> with the old agent not populating Resource.AllocationInfo.
>
> The guard will either need to be version based or agent capability based, the
> latter seeming like the stronger approach given some users upgrade off of
> master rather than using release versions.
>
> We can initially start with the master side guard, and have the agent send
> the capability once the agent-side implementation is complete.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
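
For illustration, the capability-based master-side guard discussed in this thread could be sketched roughly as follows. This is a minimal sketch and not Mesos code: `Agent`, `Framework`, `may_offer`, and `offerable_agents` are hypothetical names, capabilities are modeled as plain sets of strings, and it assumes that a framework without the MULTI_ROLE capability may still be offered resources from any agent.

```python
from dataclasses import dataclass

# Capability name taken from the thread; everything else here is hypothetical.
MULTI_ROLE = "MULTI_ROLE"


@dataclass(frozen=True)
class Agent:
    agent_id: str
    capabilities: frozenset = frozenset()


@dataclass(frozen=True)
class Framework:
    name: str
    capabilities: frozenset = frozenset()


def may_offer(framework: Framework, agent: Agent) -> bool:
    """Decide whether the master may offer this agent's resources to the framework.

    A MULTI_ROLE framework is only offered agents that advertise MULTI_ROLE.
    Assumption: frameworks without MULTI_ROLE are not restricted by this guard.
    """
    if MULTI_ROLE in framework.capabilities:
        return MULTI_ROLE in agent.capabilities
    return True


def offerable_agents(framework: Framework, agents: list) -> list:
    """Return the IDs of agents that pass the guard, preserving input order."""
    return [a.agent_id for a in agents if may_offer(framework, a)]


if __name__ == "__main__":
    old_agent = Agent("old-agent")  # not yet upgraded, advertises nothing
    new_agent = Agent("new-agent", frozenset({MULTI_ROLE}))
    multi = Framework("multi-role-fw", frozenset({MULTI_ROLE}))

    # Only the upgraded agent is eligible for the MULTI_ROLE framework.
    print(offerable_agents(multi, [old_agent, new_agent]))
```

The guard only filters offers. It does not touch tasks that are already running, which matches the statement above that long-running tasks on non-MULTI_ROLE agents continue to run.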