[
https://issues.apache.org/jira/browse/MESOS-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13443790#comment-13443790
]
Vinod Kone commented on MESOS-262:
----------------------------------
These are all valid alternatives to fix this.
In the long term, though, there are a few things we should keep in mind:
1) We should account for the resources used to launch/prepare an executor. It's
not hard to imagine that, if multiple executors are launched in parallel,
the resource usage of the slave will blow up.
2) Executor launches could be serialized (with the launcher as a separate
libprocess process, so that we can queue launches) to explicitly control the
maximum resource usage of launching.
3) Cache the executors to avoid re-downloads.
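
As a rough illustration of points 2 and 3, here is a minimal sketch in Python
(not Mesos code, and not libprocess). It assumes a single worker thread that
drains a launch queue, so only one fetch runs at a time, plus an in-memory
cache keyed by executor URI. The names SerializedLauncher, fetch, and on_ready
are hypothetical, and error handling is omitted.

```python
import queue
import threading


class SerializedLauncher:
    """Runs executor launches one at a time and caches fetched executors."""

    def __init__(self, fetch):
        self._fetch = fetch            # hypothetical download function: uri -> local path
        self._cache = {}               # uri -> local path of an already fetched executor
        self._pending = queue.Queue()  # launches waiting for their turn
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def launch(self, uri, on_ready):
        """Queue a launch; on_ready(path) is called once the executor is available."""
        self._pending.put((uri, on_ready))

    def wait(self):
        """Block until every queued launch has been processed."""
        self._pending.join()

    def _run(self):
        # Single consumer: launches are handled strictly in FIFO order,
        # so at most one fetch is in flight at any time.
        while True:
            uri, on_ready = self._pending.get()
            try:
                if uri not in self._cache:  # download only on a cache miss
                    self._cache[uri] = self._fetch(uri)
                on_ready(self._cache[uri])
            finally:
                self._pending.task_done()
```

With a fetch function that records its calls, queuing the same URI twice
triggers a single download; the second launch is served from the cache.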
> Slave should not charge the resources required for launching an executor
> against the executor
> ---------------------------------------------------------------------------------------------
>
> Key: MESOS-262
> URL: https://issues.apache.org/jira/browse/MESOS-262
> Project: Mesos
> Issue Type: Bug
> Reporter: Vinod Kone
>
> This is exacerbated when using the cgroups isolation module on the slave.
> At Twitter, we have seen this manifest as executors being killed by the
> cgroups isolation module. This happened because the high memory footprint of
> the executor's HDFS download (~400MB) exceeded the memory requested by
> the executor (128MB) for itself.
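
The arithmetic behind the kill can be sketched as follows. This is only an
illustration of the accounting, not of how the cgroups isolation module is
implemented: exceeds_limit and its parameters are hypothetical, the executor's
own usage at launch time is assumed to be zero, and 400MB stands in for the
approximate footprint quoted above.

```python
MB = 1024 * 1024


def exceeds_limit(limit_bytes, executor_bytes, launch_bytes, charge_launch_to_executor):
    """Return True if the memory charged to the executor is over its limit."""
    charged = executor_bytes
    if charge_launch_to_executor:
        # Launch overhead (e.g. the download) counts against the executor.
        charged += launch_bytes
    return charged > limit_bytes


limit = 128 * MB      # memory requested by the executor
download = 400 * MB   # approximate footprint of the download

# Overhead charged to the executor: 400MB > 128MB, so the limit is exceeded.
killed_today = exceeds_limit(limit, 0, download, charge_launch_to_executor=True)

# Overhead accounted separately: the executor stays within its limit.
killed_if_separate = exceeds_limit(limit, 0, download, charge_launch_to_executor=False)
```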