[ 
https://issues.apache.org/jira/browse/MESOS-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13443766#comment-13443766
 ] 

Jie Yu commented on MESOS-262:
------------------------------

OK, one possible fix I can think of: set the resource limit in the child 
process after the download finishes but before execl is called in 
ExecutorLauncher::run(). That may require splitting ExecutorLauncher::run() 
into two methods, prepare() and exec(), so that the isolation module can do 
the following:

{code}
pid_t pid = fork();
if (pid) {
  // parent
  ...
} else {
  // child
  ...
  launcher::ExecutorLauncher* launcher =
      createExecutorLauncher(frameworkId,
                             frameworkInfo,
                             executorInfo,
                             directory);
  launcher->prepare();
  
  // Set the resource limit only now, after prepare() has done the download.
  resourcesChanged(...);

  launcher->exec();
}
{code}
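
To make the ordering concrete, here is a minimal, self-contained sketch of the same idea. It is not the Mesos code: SketchLauncher, applyAddressSpaceLimit() and launchSketch() are hypothetical names, prepare() only stands in for the download, and a plain setrlimit(RLIMIT_AS) call is swapped in for the cgroups-based resourcesChanged(...) so that the example runs on its own.

```cpp
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <algorithm>

// Hypothetical stand-in for launcher::ExecutorLauncher with run() split in two.
class SketchLauncher {
public:
  // Work that should NOT be charged to the executor (e.g. the download).
  // Here it only records that preparation has happened.
  bool prepare() {
    prepared = true;
    return true;
  }

  // Replaces the child's process image; returns only if exec is not possible.
  void exec() {
    if (!prepared) {
      return;
    }
    // Assumption: a trivial shell command plays the role of the executor.
    execl("/bin/sh", "sh", "-c", "exit 0", static_cast<char*>(nullptr));
  }

private:
  bool prepared = false;
};

// Stand-in for resourcesChanged(...): caps the address space of the calling
// process, clamped to the hard limit so an unprivileged process can set it.
bool applyAddressSpaceLimit(rlim_t bytes) {
  struct rlimit limit;
  if (getrlimit(RLIMIT_AS, &limit) != 0) {
    return false;
  }
  limit.rlim_cur = std::min(bytes, limit.rlim_max);
  return setrlimit(RLIMIT_AS, &limit) == 0;
}

// Forks, then in the child runs prepare -> limit -> exec, in that order.
// Returns the child's exit code, or -1 if fork/wait failed.
int launchSketch(rlim_t addressSpaceBytes) {
  pid_t pid = fork();
  if (pid < 0) {
    return -1;
  }

  if (pid == 0) {
    // Child: the limit is applied only after the unconstrained prepare step.
    SketchLauncher launcher;
    if (!launcher.prepare()) {
      _exit(126);
    }
    if (!applyAddressSpaceLimit(addressSpaceBytes)) {
      _exit(125);
    }
    launcher.exec();
    _exit(127);  // Reached only if exec failed.
  }

  // Parent: wait for the child and report how it exited.
  int status = 0;
  if (waitpid(pid, &status, 0) != pid) {
    return -1;
  }
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling launchSketch(static_cast<rlim_t>(1) << 30) would run the stand-in executor under a 1 GiB address-space cap. The point is only the ordering inside the child: the limit takes effect after prepare() returns, so the preparation cost is not counted against the executor's own allowance.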
                
> Slave should not charge the resources required for launching a executor 
> against the executor 
> ---------------------------------------------------------------------------------------------
>
>                 Key: MESOS-262
>                 URL: https://issues.apache.org/jira/browse/MESOS-262
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Vinod Kone
>
> This is exacerbated when using the cgroups isolation module on the slave.
> At Twitter, we have seen this manifest as executors being killed by the 
> cgroups isolation module. This happened because the memory footprint of 
> the HDFS download of the executor (~400MB) exceeds the memory the executor 
> requested for itself (128MB).
