> On Oct. 1, 2013, 8:17 p.m., Chi Zhang wrote:
> > This is a bigger issue than I thought. The resources in a task only get 
> > accounted for when the task is launched. Copying them earlier means the 
> > first task's resources are double counted when it is actually started after 
> > the executor is launched. Whether or not a task has started and the 
> > resources associated with it need to be tracked separately. 
> > 
> > Any thoughts?
> 
> Ben Mahler wrote:
>     Is it that some resource subsystems require non-zero resources when the 
>     executor is launched? If the answer is yes, can we have a minimum initial 
>     resource allocation (akin to what is done in CgroupsIsolator)? See the 
>     following constants:
>     
>     // CPU subsystem constants.
>     const size_t CPU_SHARES_PER_CPU = 1024;
>     const size_t MIN_CPU_SHARES = 10;
>     const Duration CPU_CFS_PERIOD = Milliseconds(100); // Linux default.
>     const Duration MIN_CPU_CFS_QUOTA = Milliseconds(1);
>     
>     // Memory subsystem constants.
>     const Bytes MIN_MEMORY = Megabytes(32);
>     
>     It's not ideal but it may be a simpler solution to your problem.

That would do for now since we aren't adding new features, but part of our goal 
in the current refactoring work is to allow different combinations of resource 
isolation modules to be used for different executors. A resource such as a disk 
partition would need a non-zero requirement in order to initialize. There are 
also more 'optional' features (compared to cpu, mem, disk and port), such as 
namespaces, that we are trying to provide a foundation for. Namespaces might 
not take a number to initialize, but they affect which system API the launcher 
uses. 
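
As a rough sketch of the minimum-allocation fallback discussed above: the 
constants are the ones quoted from CgroupsIsolator, while the helper names 
cpuShares and memoryLimitMb are hypothetical and only illustrate the clamping 
idea, not the isolator's actual code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Values mirror the constants quoted above; memory is expressed in MB here
// only to keep the sketch free of Mesos types.
const uint64_t CPU_SHARES_PER_CPU = 1024;
const uint64_t MIN_CPU_SHARES = 10;
const uint64_t MIN_MEMORY_MB = 32;

// Hypothetical helper: convert a (possibly zero) CPU request into cpu.shares,
// never going below the floor.
uint64_t cpuShares(double cpus) {
  assert(cpus >= 0.0);
  return std::max(static_cast<uint64_t>(CPU_SHARES_PER_CPU * cpus),
                  MIN_CPU_SHARES);
}

// Hypothetical helper: clamp a memory request to the minimum limit.
uint64_t memoryLimitMb(uint64_t requestedMb) {
  return std::max(requestedMb, MIN_MEMORY_MB);
}
```

With these helpers an executor launched with empty resources would still get 
10 CPU shares and a 32 MB memory limit. The same trick does not carry over to 
something like a disk partition, where no sensible default exists.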

All of these would require some form of pass-down from the slave, extracted 
from the task message, when an executor is launched. I still think the 
'resources' argument of launchExecutor is the best candidate for passing them 
along.
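
One way the pass-down could avoid the double counting mentioned earlier is to 
remember which task the initial resources came from. The sketch below is only 
an illustration under that assumption: Res and ExecutorAccount are hypothetical 
stand-ins, not the slave's real data structures.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical, simplified stand-in for Mesos resources.
struct Res {
  double cpus;
  double memMb;
};

// Hypothetical per-executor bookkeeping.
class ExecutorAccount {
public:
  // Called once at executor launch. The initial resources are borrowed from
  // the first task so that resource subsystems have something to start with.
  ExecutorAccount(const std::string& firstTaskId, const Res& initial)
    : total_(initial) {
    accounted_.insert(firstTaskId);
  }

  // Called when a task actually starts. A task whose resources were already
  // counted at executor launch is skipped, so nothing is counted twice.
  void taskStarted(const std::string& taskId, const Res& r) {
    assert(r.cpus >= 0.0 && r.memMb >= 0.0);
    if (!accounted_.insert(taskId).second) {
      return;
    }
    total_.cpus += r.cpus;
    total_.memMb += r.memMb;
  }

  const Res& total() const { return total_; }

private:
  Res total_;
  std::set<std::string> accounted_;
};
```

Starting the first task after the executor is up then leaves the total 
unchanged, while any later task adds its resources exactly once.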


- Chi


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14414/#review26580
-----------------------------------------------------------


On Sept. 30, 2013, 9:12 p.m., Chi Zhang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/14414/
> -----------------------------------------------------------
> 
> (Updated Sept. 30, 2013, 9:12 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman, Ben Mahler, Ian Downes, Jie Yu, 
> David Mackey, Vinod Kone, and Jiang Yan Xu.
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
>     slave: Copy resource requirements from the first TaskInfo message to the 
>     ExecutorInfo before an executor is launched.
>     
>     Otherwise, this leads to a null value passed to launchExecutor for the 
>     resources field. It's necessary for some resource subsystems to initialize 
>     executors with resource requirement upfront.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.cpp 0ad4576 
> 
> Diff: https://reviews.apache.org/r/14414/diff/
> 
> 
> Testing
> -------
> 
> Can't tell for sure. With or without the patch, `make -j check` fails at the 
> same place on a Mesos dev box.
> 
> [----------] Global test environment tear-down
> [==========] 263 tests from 47 test cases ran. (146351 ms total)
> [  PASSED  ] 259 tests.
> [  FAILED  ] 4 tests, listed below:
> [  FAILED  ] CgroupsIsolatorTest.ROOT_CGROUPS_BalloonFramework
> [  FAILED  ] SASL.success
> [  FAILED  ] SASL.failed1
> [  FAILED  ] SASL.failed2
> 
>  4 FAILED TESTS
> make[3]: *** [check-local] Error 1
> make[3]: Leaving directory `/home/czhang/mesos-apache/build/src'
> make[2]: *** [check-am] Error 2
> make[2]: Leaving directory `/home/czhang/mesos-apache/build/src'
> make[1]: *** [check] Error 2
> make[1]: Leaving directory `/home/czhang/mesos-apache/build/src'
> make: *** [check-recursive] Error 1
> Connection to smfd-aki-27-sr1.devel.twitter.com closed.
> 
> 
> Thanks,
> 
> Chi Zhang
> 
>
