[
https://issues.apache.org/jira/browse/MESOS-2818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14577632#comment-14577632
]
Jie Yu edited comment on MESOS-2818 at 6/8/15 6:44 PM:
-------------------------------------------------------
Chatted about this with BenM and Vinod; there is still a race condition that
this solution cannot prevent. Here is the timeline:
1) Slave wants to call estimator->oversubscribable(..)
2) Slave calculates 'allocated' and passes it to 'oversubscribable(..)'
3) A new task T gets launched on the slave
4) Resource estimator calls 'usages()'
5) The results from 'usages()' include information about T, but T does not
exist in 'allocated'
6) The resource estimator does not know how to calculate usage slack for T
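
The timeline above can be replayed in a few lines. This is only a rough sketch: the dictionaries, the executor names, and the helper find_unaccounted() are hypothetical stand-ins for the slave's bookkeeping, not the Mesos API, and the usage numbers are made up.

```python
# Hypothetical stand-ins for the slave's bookkeeping; not the Mesos API.

def find_unaccounted(allocated, usages):
    """Return the executor ids that appear in usages but not in allocated."""
    return sorted(set(usages) - set(allocated))

# Steps 1-2: the slave snapshots 'allocated' (CPUs per executor).
running = {"executor-A": 2.0}
allocated = dict(running)

# Step 3: task T is launched after the snapshot was taken.
running["executor-T"] = 1.0

# Step 4: the estimator samples usage from the live set of containers.
# (Usage is faked here as half of each executor's allocation.)
usages = {executor: 0.5 * cpus for executor, cpus in running.items()}

# Steps 5-6: T has a usage sample but no allocation to compare it against.
unaccounted = find_unaccounted(allocated, usages)
```

In this sketch 'unaccounted' contains only the executor for T, which is exactly the entry the estimator cannot compute usage slack for.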
> Pass 'allocated' resources for each executor to the resource estimator.
> -----------------------------------------------------------------------
>
> Key: MESOS-2818
> URL: https://issues.apache.org/jira/browse/MESOS-2818
> Project: Mesos
> Issue Type: Task
> Reporter: Jie Yu
> Assignee: Jie Yu
>
> The resource estimator obviously needs this information to calculate, say,
> the usage slack. Now the question is how. There are two approaches:
> 1) Pass in the allocated resources for each executor through the
> 'oversubscribable()' interface.
> 2) Let the containerizer return the total resources allocated for each
> container when 'usages()' is invoked.
> I would suggest taking route (1) for several reasons:
> 1) Eventually, we'll need to pass in slave's total resources to the resource
> estimator (so that RE can calculate allocation slack). There is no way that
> we can get that from containerizer. The slave's total resources keep changing
> due to dynamic reservation. So we cannot pass in the slave total resources
> during initialization.
> 2) The current implementation of usages() might skip a container if it
> fails to get statistics for that container (not an error). This would give
> the RE incomplete information.
> 3) We may want to calculate 'unallocated = total - allocated' so that we can
> send allocation slack as well. Getting 'total' and 'allocated' from two
> different components might result in inconsistent values. Remember that
> 'total' keeps changing due to dynamic reservation.
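
As a rough illustration of why route (1) helps, the sketch below computes both kinds of slack from one consistent set of inputs. The function names, the scalar CPU values, and the definition of usage slack as "allocated minus used" are assumptions made for this example; the real estimator interface works on Mesos Resources objects.

```python
# Hypothetical helpers; names and scalar-CPU model are assumptions.

def usage_slack(allocated, usages):
    """Per-executor usage slack, assumed to be allocated minus used.

    An executor with no usage sample (usages() skipped it) maps to None,
    so the caller can see the gap instead of silently losing the executor.
    """
    slack = {}
    for executor, alloc in allocated.items():
        used = usages.get(executor)
        slack[executor] = None if used is None else max(alloc - used, 0.0)
    return slack

def allocation_slack(total, allocated):
    """unallocated = total - allocated, floored at zero."""
    return max(total - sum(allocated.values()), 0.0)

total = 8.0                          # slave total, passed in with 'allocated'
allocated = {"a": 4.0, "b": 2.0}     # CPUs allocated per executor
usages = {"a": 1.5}                  # usages() skipped executor "b"

per_executor = usage_slack(allocated, usages)
unallocated = allocation_slack(total, allocated)
```

Because 'allocated' arrives from the slave rather than from usages(), executor "b" still shows up (with no slack value) even though its statistics were skipped, and 'total' and 'allocated' come from the same component.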