Bumping this up. I'm guessing people haven't had time to review it yet, but it
would be great to get feedback on this.
Thanks,
Tom
On Tuesday, August 6, 2019, 2:27:49 PM CDT, Tom Graves
<[email protected]> wrote:
Hey everyone,
I have been working on a proposal for supporting stage-level resource
configuration and scheduling. The basic idea is to let the user specify
executor and task resource requirements for each stage, giving finer-grained
control over the resources required. One good example is doing some ETL to
preprocess your data in one stage and then feeding that data into an ML
algorithm (like TensorFlow) that runs as a separate stage. The ETL stage could
need totally different executor/task resources than the ML stage does.
If you are interested, please take a look at the SPIP and give me feedback. The
text for the SPIP is in the JIRA description:
https://issues.apache.org/jira/browse/SPARK-27495
I split the API and design parts into a Google doc that is linked from the
JIRA.
Thanks,
Tom