[ https://issues.apache.org/jira/browse/MESOS-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yan Xu reassigned MESOS-4975:
-----------------------------

    Assignee: Yan Xu

> mesos::internal::master::Slave::tasks can grow unboundedly
> ----------------------------------------------------------
>
>                 Key: MESOS-4975
>                 URL: https://issues.apache.org/jira/browse/MESOS-4975
>             Project: Mesos
>          Issue Type: Bug
>          Components: master
>            Reporter: Yan Xu
>            Assignee: Yan Xu
>
> In a Mesos cluster we observed the following:
> {noformat:title=}
> $ jq '.orphan_tasks | length' state.json
> 1369
> $ jq '.unregistered_frameworks | length' state.json
> 20162
> {noformat}
> Aside from {{unregistered_frameworks}} here being "the list of frameworkIDs
> for each orphan task" (described in MESOS-4973), the discrepancy between the
> two values above is surprising.
> I think the problem is that we do this in the master:
> From
> [source|https://github.com/apache/mesos/blob/e376d3aa0074710278224ccd17afd51971820dfb/src/master/master.cpp#L2212]:
> {code}
> foreachvalue (Slave* slave, slaves.registered) {
>   foreachvalue (Task* task, slave->tasks[framework->id()]) {
>     framework->addTask(task);
>   }
>   foreachvalue (const ExecutorInfo& executor,
>                 slave->executors[framework->id()]) {
>     framework->addExecutor(slave->id, executor);
>   }
> }
> {code}
> Here {{operator[]}} is used whenever a framework subscribes, regardless of
> whether this agent has tasks for the framework or not.
> If the agent has no task for this framework, then a \{frameworkID:
> empty hashmap\} entry is inserted and stays in the map indefinitely! If
> frameworks are ephemeral and new ones keep coming in, the map grows unboundedly.
> We should check {{tasks.contains(frameworkId)}} before using the {{[] operator}}.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
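
The growth described in the issue can be reproduced outside Mesos. The following is only a minimal sketch, assuming that the Mesos hashmap behaves like {{std::unordered_map}} with respect to {{operator[]}} (a missing key is default-inserted). The type aliases, function names and framework IDs are hypothetical stand-ins, not the real Mesos definitions, and {{find()}} is used here in place of {{contains()}} so that the sketch does not require C++20.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for the Mesos types (assumptions, not real definitions).
using FrameworkID = std::string;
using TaskID = std::string;
using TaskMap = std::unordered_map<TaskID, int>;
using TasksByFramework = std::unordered_map<FrameworkID, TaskMap>;

// Unguarded lookup, mirroring the pattern in the quoted loop:
// operator[] default-constructs an empty TaskMap when `id` is missing,
// so a read-only query mutates the outer map.
std::size_t countTasksUnguarded(TasksByFramework& tasks, const FrameworkID& id)
{
  return tasks[id].size();
}

// Guarded lookup: find() never inserts, so the map can be taken by const
// reference and a missing framework simply yields zero tasks.
std::size_t countTasksGuarded(const TasksByFramework& tasks,
                              const FrameworkID& id)
{
  auto it = tasks.find(id);
  return it == tasks.end() ? 0 : it->second.size();
}

// Simulates `n` ephemeral frameworks subscribing against one agent that
// holds a single task for an unrelated framework, and returns the number
// of entries left in the agent's per-framework map afterwards.
std::size_t entriesAfterSubscriptions(bool guarded, int n)
{
  TasksByFramework tasks;
  tasks["framework-with-task"]["task-1"] = 1;

  for (int i = 0; i < n; ++i) {
    const FrameworkID id = "ephemeral-" + std::to_string(i);
    const std::size_t count = guarded ? countTasksGuarded(tasks, id)
                                      : countTasksUnguarded(tasks, id);
    assert(count == 0); // None of the ephemeral frameworks has tasks here.
    (void) count;       // Avoids an unused-variable warning under NDEBUG.
  }

  return tasks.size();
}
```

Calling {{entriesAfterSubscriptions(false, n)}} leaves one empty entry per subscribing framework in addition to the original one, while {{entriesAfterSubscriptions(true, n)}} leaves the map at its original single entry.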