[jira] [Updated] (MESOS-4975) mesos::internal::master::Slave::tasks can grow unboundedly

2016-10-28 Thread Yan Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Xu updated MESOS-4975:
--
Fix Version/s: 1.0.2

> mesos::internal::master::Slave::tasks can grow unboundedly
> --
>
> Key: MESOS-4975
> URL: https://issues.apache.org/jira/browse/MESOS-4975
> Project: Mesos
> Issue Type: Bug
> Components: master
> Reporter: Yan Xu
> Assignee: Yan Xu
> Fix For: 1.0.2, 1.1.0, 1.2.0
>
>
> In a Mesos cluster we observed the following:
> {noformat:title=}
> $ jq '.orphan_tasks | length' state.json
> 1369
> $ jq '.unregistered_frameworks | length' state.json
> 20162
> {noformat}
> Aside from {{unregistered_frameworks}} here being "the list of frameworkIDs 
> for each orphan task" (described in MESOS-4973), the discrepancy between the 
> two values above is surprising.
> I think the problem is that we do this in the master:
> From 
> [source|https://github.com/apache/mesos/blob/e376d3aa0074710278224ccd17afd51971820dfb/src/master/master.cpp#L2212]:
> {code}
> foreachvalue (Slave* slave, slaves.registered) {
>   foreachvalue (Task* task, slave->tasks[framework->id()]) {
>     framework->addTask(task);
>   }
>   foreachvalue (const ExecutorInfo& executor,
>                 slave->executors[framework->id()]) {
>     framework->addExecutor(slave->id, executor);
>   }
> }
> {code}
> Here {{operator[]}} is used whenever a framework subscribes, regardless of 
> whether this agent has tasks for the framework or not.
> If the agent has no task for this framework, {{operator[]}} inserts a \{frameworkID: 
> empty hashmap\} entry that stays in the map indefinitely! If frameworks are 
> ephemeral and new ones keep coming in, the map grows unboundedly.
> We should check {{tasks.contains(frameworkId)}} before using {{operator[]}}.
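
The growth described above can be reproduced outside Mesos. The following is a minimal sketch, assuming the map behaves like {{std::unordered_map}}; {{FrameworkID}}, {{TaskID}}, {{Task}}, {{TaskMap}} and both helper functions are hypothetical stand-ins for illustration, not the actual Mesos types or the committed fix.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for the Mesos types; only the map shape matters here.
using FrameworkID = std::string;
using TaskID = std::string;
struct Task {};
using TaskMap =
    std::unordered_map<FrameworkID, std::unordered_map<TaskID, Task*>>;

// Mirrors the pattern in the report: operator[] default-constructs and
// inserts an empty inner map when the framework ID is not present yet.
std::size_t countTasksWithSubscript(TaskMap& tasks, const FrameworkID& id) {
  std::size_t count = 0;
  for (const auto& entry : tasks[id]) {
    (void)entry;
    ++count;
  }
  return count;
}

// Looks the key up first, so an unknown framework ID leaves the map
// untouched. Taking the map by const reference also makes an accidental
// operator[] a compile error.
std::size_t countTasksWithFind(const TaskMap& tasks, const FrameworkID& id) {
  auto it = tasks.find(id);
  if (it == tasks.end()) {
    return 0;
  }
  return it->second.size();
}
```

Calling {{countTasksWithSubscript}} once for each of N framework IDs that have no tasks leaves N empty entries in the map, while {{countTasksWithFind}} leaves it unchanged.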



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4975) mesos::internal::master::Slave::tasks can grow unboundedly

2016-10-28 Thread Yan Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Xu updated MESOS-4975:
--
Fix Version/s: 1.2.0

> mesos::internal::master::Slave::tasks can grow unboundedly
> --
>
> Key: MESOS-4975
> URL: https://issues.apache.org/jira/browse/MESOS-4975
> Project: Mesos
> Issue Type: Bug
> Components: master
> Reporter: Yan Xu
> Assignee: Yan Xu
> Fix For: 1.1.0, 1.2.0
>
>
> In a Mesos cluster we observed the following:
> {noformat:title=}
> $ jq '.orphan_tasks | length' state.json
> 1369
> $ jq '.unregistered_frameworks | length' state.json
> 20162
> {noformat}
> Aside from {{unregistered_frameworks}} here being "the list of frameworkIDs 
> for each orphan task" (described in MESOS-4973), the discrepancy between the 
> two values above is surprising.
> I think the problem is that we do this in the master:
> From 
> [source|https://github.com/apache/mesos/blob/e376d3aa0074710278224ccd17afd51971820dfb/src/master/master.cpp#L2212]:
> {code}
> foreachvalue (Slave* slave, slaves.registered) {
>   foreachvalue (Task* task, slave->tasks[framework->id()]) {
>     framework->addTask(task);
>   }
>   foreachvalue (const ExecutorInfo& executor,
>                 slave->executors[framework->id()]) {
>     framework->addExecutor(slave->id, executor);
>   }
> }
> {code}
> Here {{operator[]}} is used whenever a framework subscribes, regardless of 
> whether this agent has tasks for the framework or not.
> If the agent has no task for this framework, {{operator[]}} inserts a \{frameworkID: 
> empty hashmap\} entry that stays in the map indefinitely! If frameworks are 
> ephemeral and new ones keep coming in, the map grows unboundedly.
> We should check {{tasks.contains(frameworkId)}} before using {{operator[]}}.


