Github user SleepyThread commented on the pull request:
https://github.com/apache/spark/pull/4027#issuecomment-141579731
@andrewor14 @pwendell @tnachen @dragos. At my company we have patched
Spark so users can specify a minimum and maximum number of cores per
executor, with only one executor running on each slave. I have a patch
ready for it and am waiting for my refactoring #8771 to be merged into master.
The benefits of this approach over the one above are as follows.
Assume the job requires 30 cores in total, memory per task is 10 GB, and
Mesos has 10 slaves, each with 32 cores and 64 GB of memory.
1. Suppose Spark receives offers of 1 CPU and 50 GB of memory from 5 different
slaves (because those slaves are running CPU-heavy jobs at the moment). Each of
these offers would be accepted, so 5 executors would run on 5 slaves, holding
only 5 cores in total but 50 GB of memory from the cluster. If I can specify a
minimum number of cores per executor, these offers would be rejected instead.
2. Suppose Spark receives an offer of 30 CPUs and 50 GB of memory. That offer
would be accepted and the whole Spark job would run on a single slave; if that
slave is lost, the entire job goes with it. If we cap the maximum number of
cores per executor at 10, the offers are spread across the cluster and the job
no longer runs on a single machine.
Together, the max and min settings give the user control to cap the amount of
resources used by the Spark job while still giving Mesos the elasticity to
schedule work across different machines, roughly as sketched below.
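To make the idea concrete, here is a rough sketch of the offer-filtering logic, not the actual patch: the `Offer` case class, the `MinMaxOfferFilter` class, and the parameter names are made up for illustration only.

```scala
// Hypothetical sketch of min/max cores-per-executor offer filtering.
// None of these names come from the real patch.
case class Offer(slaveId: String, cpus: Double, memGb: Double)

class MinMaxOfferFilter(minCoresPerExecutor: Int,
                        maxCoresPerExecutor: Int,
                        memPerExecutorGb: Double) {

  // Returns the number of cores to take from this offer, or None to decline it.
  def coresToAccept(offer: Offer): Option[Int] = {
    if (offer.cpus < minCoresPerExecutor || offer.memGb < memPerExecutorGb) {
      // Case 1 above: a 1-CPU / 50 GB offer is declined instead of
      // launching a tiny executor that still locks up 10 GB of memory.
      None
    } else {
      // Case 2 above: a 30-CPU offer is capped at maxCoresPerExecutor,
      // so the remaining cores go back to Mesos for other slaves.
      Some(math.min(offer.cpus.toInt, maxCoresPerExecutor))
    }
  }
}

object Example extends App {
  val filter = new MinMaxOfferFilter(minCoresPerExecutor = 5,
                                     maxCoresPerExecutor = 10,
                                     memPerExecutorGb = 10.0)
  println(filter.coresToAccept(Offer("slave-1", cpus = 1, memGb = 50)))  // None
  println(filter.coresToAccept(Offer("slave-2", cpus = 30, memGb = 50))) // Some(10)
}
```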
Any thoughts?