On Mon, Mar 4, 2019 at 7:24 AM Sean Owen <sro...@gmail.com> wrote:
> To be clear, those goals sound fine to me. I don't think voting on
> those two broad points is meaningful, but, does no harm per se. If you
> mean this is just a check to see if people believe this is broadly
> worthwhile, then +1 from me.

Yes it is.

> That means we'd want to review something more detailed later, whether
> it's a a) design doc we vote on or b) a series of pull requests. Given
> the number of questions this leaves open, a) sounds better and I think
> what you're suggesting. I'd call that the SPIP, but, so what, it's
> just a name. The thing is, a) seems already mostly done, in the second
> document that was attached.
It is far from done. We still need to review the APIs and the design for
each major component:

* Internal changes to the Spark job scheduler.
* Interfaces exposed to users.
* Interfaces exposed to cluster managers.
* Standalone / auto-discovery.
* YARN
* K8s
* Mesos
* Jenkins

I try to avoid discussing each of them in this thread because they require
different domain experts. After we have a high-level agreement on adding
accelerator support to Spark, we can kick off the work in parallel. If any
committer thinks a follow-up work item still needs an SPIP, we just follow
the SPIP process to resolve it.

> I'm hesitating because I'm not sure why
> it's important to not discuss that level of detail here, as it's
> already available. Just too much noise?

Yes. If we go down one or two levels, we might have to pull in different
domain experts for different questions.

> but voting for this seems like
> endorsing those decisions, as I can only assume the proposer is going
> to continue the design with those decisions in mind.

That is certainly not the purpose, which was why there were two docs, not
just one SPIP. The purpose of the companion doc is just to give some
concrete stories and estimate what could be done in Spark 3.0. Maybe we
should update the SPIP doc and make it clear that certain features are
pending follow-up discussions.

> What's the next step in your view, after this, and before it's
> implemented? As long as there is one, sure, let's punt. Seems like we
> could begin that conversation nowish.

We should assign each major component an "owner" who can lead the
follow-up work, e.g.,

* Internal changes to the Spark scheduler
* Interfaces to cluster managers and users
* Standalone support
* YARN support
* K8s support
* Mesos support
* Test infrastructure
* FPGA

Again, for each component the questions we should answer are first "Is it
important?" and then "How to implement it?". Community members who are
interested in each discussion should subscribe to the corresponding JIRA.
If some committer thinks we need a follow-up SPIP, either to make more
members aware of the changes or to reach agreement, feel free to call it
out.

> Many of those questions you list are _fine_ for a SPIP, in my opinion.
> (Of course, I'd add what cluster managers are in/out of scope.)

I think the two that require more discussion are Mesos and K8s. Let me
follow what I suggested above and try to answer the two questions for each:

Mesos:
* Is it important? There are certainly Spark/Mesos users, but the overall
usage is going downhill. See the attached Google Trends snapshot.
* How to implement it? I believe it is doable, similarly to the other
cluster managers. However, we need to find someone from our community to
do the work. If we cannot find such a person, it would indicate that the
feature is not that important.

K8s:
* Is it important? K8s is the fastest-growing cluster manager, but the
current Spark support is experimental. Building features on top of it
would add additional cost if we want to make changes.
* How to implement it? There is a sketch in the companion doc. Yinan
mentioned three options for exposing the interfaces to users. We need to
finalize the design and discuss which option is the best to go with.

You can see that such discussions can be done in parallel. It is not
efficient if we block the work on K8s because we cannot decide whether we
should support Mesos.
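To make the K8s question a bit more concrete: purely for illustration
(these are not necessarily Yinan's three options, and the conf key and
file name below are made up for this email, not settled API), a GPU
request could surface to the user either as a plain Spark conf on the
spark-submit command line or as an executor pod template that the K8s
backend merges into the pods it creates:

    # Option A (hypothetical conf key): request GPUs on the command line
    spark-submit \
      --master k8s://https://<api-server> \
      --conf spark.executor.resource.gpu.amount=2 \
      ...

    # Option B (hypothetical): a user-provided executor pod template,
    # e.g. gpu-executor-template.yaml referenced from a Spark conf
    spec:
      containers:
        - name: executor
          resources:
            limits:
              nvidia.com/gpu: 2

Whichever surface we end up with, the choice only affects how the request
is expressed to K8s; that is exactly the kind of detail the follow-up
design discussion should settle.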
> On Mon, Mar 4, 2019 at 9:07 AM Xiangrui Meng <men...@gmail.com> wrote:
> >
> > What finer "high level" goals do you recommend? To make progress on the
> > vote, it would be great if you can articulate more. The current SPIP
> > proposes two high-level changes to make Spark accelerator-aware:
> >
> > At the cluster manager level, we update or upgrade cluster managers to
> > include GPU support. Then we expose user interfaces for Spark to request
> > GPUs from them.
> >
> > Within Spark, we update its scheduler to understand available GPUs
> > allocated to executors and user task requests, and to assign GPUs to
> > tasks properly.
> >
> > How do you want to change or refine them? I saw you raised questions
> > around Horovod requirements and GPU/memory allocation. But there are tens
> > of questions at the same or even higher level. E.g., in preparing the
> > companion scoping doc we saw the following questions:
> >
> > * How to test GPU support on Jenkins?
> > * Does the proposed solution also work for FPGAs? What are the diffs?
> > * How to make standalone workers auto-discover GPU resources?
> > * Do we want to allow users to request GPU resources in Pandas UDFs?
> > * How does the user pass GPU requests to K8s: via the spark-submit
> >   command line or via a pod template?
> > * Do we create a separate queue for GPU task scheduling so it doesn't
> >   cause regressions on normal jobs?
> > * How to monitor the utilization of GPUs? At what levels?
> > * Do we want to support GPU-backed physical operators?
> > * Do we allow users to request both non-default numbers of CPUs and GPUs?
> > * ...
> >
> > IMHO, we cannot, nor should we, answer questions at this level in this
> > vote. The vote is mainly on whether we should make Spark accelerator-aware
> > to help unify big data and AI solutions, specifically whether Spark should
> > provide proper support for deep learning model training and inference,
> > where accelerators are essential. My +1 vote is based on the following
> > logic:
> >
> > * It is important for Spark to become the de facto solution in
> >   connecting big data and AI.
> > * The work is doable given the design sketch and the early
> >   investigation/scoping.
> >
> > To me, "-1" means either it is not important for Spark to support such
> > use cases or we certainly cannot afford to implement such support. This is
> > my understanding of the SPIP and the vote. It would be great if you can
> > elaborate what changes you want to make or what answers you want to see.
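For anyone who wants a mental model of the second quoted change (the
scheduler understanding the GPUs allocated to executors and assigning them
to tasks), here is a toy sketch in Scala. Every name in it is invented for
this email; it is not the proposed implementation, only an illustration of
the bookkeeping involved:

    import scala.collection.mutable

    object GpuAssignmentSketch {
      // Executor id -> free GPU addresses, as they would be reported by
      // the cluster manager (e.g., device indices "0", "1", ...).
      private val freeGpus = mutable.Map(
        "exec-1" -> mutable.Queue("0", "1"),
        "exec-2" -> mutable.Queue("0"))

      // Try to reserve numGpus on execId for a task; return the assigned
      // addresses, or None if the executor has too few free GPUs.
      def assign(execId: String, numGpus: Int): Option[Seq[String]] = {
        val free = freeGpus.getOrElse(execId, mutable.Queue.empty[String])
        if (free.size < numGpus) None
        else Some(Seq.fill(numGpus)(free.dequeue()))
      }

      // Give the addresses back when the task finishes.
      def release(execId: String, addresses: Seq[String]): Unit =
        freeGpus.getOrElseUpdate(execId, mutable.Queue.empty[String]) ++= addresses

      def main(args: Array[String]): Unit = {
        println(assign("exec-1", 2)) // Some(List(0, 1))
        println(assign("exec-2", 2)) // None -- only one GPU free on exec-2
        release("exec-1", Seq("0", "1"))
      }
    }

The real scheduler work is of course more involved (conf plumbing,
locality, reclaiming GPUs from failed tasks, etc.), which is exactly why
that component deserves its own owner and its own design review.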