Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-26 Thread Mark Hamstra
Yes, I do expect that the application-level approach outlined in this SPIP
will be sufficiently useful to be worth doing despite any concerns about it
not being ideal. My concern is not just about this design, however. It
feels to me like we are running into limitations of the current Spark
scheduler and that what is really needed is a deeper redesign in order to
be able to cleanly handle new or anticipated requirements like barrier mode
scheduling, GPUs, FPGAs, other domain specific resources, FaaS/serverless,
etc. Instead, what we are getting is layers of clever hacks to sort of make
the current scheduler do new things. The current scheduler was already too
complicated and murky for our own good, and these new grafts tend to make
that worse.

Unfortunately, I can't currently commit to trying to drive such a New
Scheduler effort, and I don't know anyone who can. We also can't
conceivably do something along these lines in Spark 3.0.0 -- there's just
not enough time even if other resources were available; so I don't have a
clear idea about the way forward. I am concerned, though, that scheduler
development isn't currently in very good shape and doesn't have a
better-looking future.  That is not at all intended as a slight on those
who are making contributions now after most of us who used to be more
active haven't been able to continue to be: current contributions are much
appreciated; they're just not enough -- which is not the fault of anyone
currently contributing. I've wandered out of the context of this SPIP, I
know. I'll at least +0 this SPIP, but I also couldn't let my concerns go
unvoiced.

On Mon, Mar 25, 2019 at 8:32 PM Xiangrui Meng  wrote:

>
>
> On Mon, Mar 25, 2019 at 8:07 PM Mark Hamstra 
> wrote:
>
>> Maybe.
>>
>> And I expect that we will end up doing something based on spark.task.cpus
>> in the short term. I'd just rather that this SPIP not make it look like
>> this is the way things should ideally be done. I'd prefer that we be quite
>> explicit in recognizing that this approach is a significant compromise, and
>> I'd like to see at least some references to the beginning of serious
>> longer-term efforts to do something better in a deeper re-design of
>> resource scheduling.
>>
>
> It is also a feature I desire as a user. How about suggesting it as
> future work in the SPIP? It certainly requires someone who fully
> understands the Spark scheduler to drive it. Shall we start with a Spark
> JIRA? I don't know the scheduler as well as you do, but I can speak for DL
> use cases. Maybe we just view it from different angles. To you, the
> application-level request is a significant compromise. To me, it is a
> major milestone that brings GPUs to Spark workloads. I know many users who
> tried to do DL on Spark and ended up doing hacks here and there, which was
> a huge pain. The scope covered by the current SPIP makes those users much
> happier. Tom and Andy from NVIDIA are certainly better calibrated on the
> usefulness of the current proposal.
>
>
>>
>> On Mon, Mar 25, 2019 at 7:39 PM Xiangrui Meng 
>> wrote:
>>
>>> There are certainly use cases where different stages require different
>>> numbers of CPUs or GPUs under an optimal setting. I don't think anyone
>>> disagrees that ideally users should be able to do it. We are just dealing
>>> with typical engineering trade-offs and seeing how we can break the
>>> problem down into smaller ones. I think it is fair to treat the
>>> task-level resource request as a separate feature here because it also
>>> applies to CPUs alone, without GPUs, as Tom mentioned above. But with
>>> only "spark.task.cpus" for many years, Spark has still been able to cover
>>> many, many use cases; otherwise we wouldn't see so many Spark users
>>> around now. Here we just apply similar arguments to GPUs.
>>>
>>> Initially, I was the person who really wanted task-level requests
>>> because it is ideal. In an offline discussion, Andy Feng pointed out that
>>> an application-level setting should fit common deep learning training and
>>> inference cases and greatly simplifies the changes required to the Spark
>>> job scheduler. With Imran's feedback on the initial design sketch, the
>>> application-level approach became my first choice because it is still
>>> very valuable but much less risky. If a feature brings great value to
>>> users, we should add it even if it is not ideal.
>>>
>>> Back to the default value discussion, let's forget GPUs and only
>>> consider CPUs. Would an application-level default number of CPU cores
>>> disappear if we added task-level requests? If yes, does it mean that
>>> users have to explicitly state the resource requirements for every single
>>> stage? That is tedious to do, and users who do not fully understand the
>>> impact would probably do it wrong and waste even more resources. Then how
>>> many cores should each task use if the user didn't specify? I do see
>>> "spark.task.cpus" as the answer here. The point I want to make is that
>>> "spark.task.cpus", though less ideal, is still needed when we have
>>> task-level requests for CPUs.

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-26 Thread Imran Rashid
+1 on the updated SPIP

I agree with all of Mark's concerns -- that eventually we want some way for
users to express per-task constraints -- but I feel like this is still a
reasonable step forward.

In the meantime, users will either write small spark applications, which
just do the steps which need gpus, and then run separate spark applications
which don't, with something external to orchestrate that pipeline; or
they'll run one giant application, which utilizes resources really poorly.
After we have task-specific constraints, both will still work, but there
would be motivation to tune the giant application.  And plenty of users
might still want to write small spark applications that use gpus, and this
would continue to help them out, without having to worry about the
complexity of per-task constraints.

On Tue, Mar 26, 2019 at 12:33 AM Xingbo Jiang  wrote:

> +1 on the updated SPIP
>
> Xingbo Jiang wrote on Tue, Mar 26, 2019 at 1:32 PM:
>
>> Hi all,
>>
>> Now that we have had a few discussions over the updated SPIP, and we have
>> updated it to address new feedback from some committers, IMO the SPIP is
>> ready for another round of voting.
>> On the updated SPIP, we currently have two +1s (from Tom and Xiangrui);
>> everyone else, please vote again.
>>
>> The vote will be up for the next 72 hours.
>>
>> Thanks!
>>
>> Xingbo
>>
>> Xiangrui Meng wrote on Tue, Mar 26, 2019 at 11:32 AM:
>>
>>>
>>>
>>> On Mon, Mar 25, 2019 at 8:07 PM Mark Hamstra 
>>> wrote:
>>>
 Maybe.

 And I expect that we will end up doing something based on
 spark.task.cpus in the short term. I'd just rather that this SPIP not make
 it look like this is the way things should ideally be done. I'd prefer that
 we be quite explicit in recognizing that this approach is a significant
 compromise, and I'd like to see at least some references to the beginning
 of serious longer-term efforts to do something better in a deeper re-design
 of resource scheduling.

>>>
>>> It is also a feature I desire as a user. How about suggesting it as
>>> future work in the SPIP? It certainly requires someone who fully
>>> understands the Spark scheduler to drive it. Shall we start with a Spark
>>> JIRA? I don't know the scheduler as well as you do, but I can speak for
>>> DL use cases. Maybe we just view it from different angles. To you, the
>>> application-level request is a significant compromise. To me, it is a
>>> major milestone that brings GPUs to Spark workloads. I know many users
>>> who tried to do DL on Spark and ended up doing hacks here and there,
>>> which was a huge pain. The scope covered by the current SPIP makes those
>>> users much happier. Tom and Andy from NVIDIA are certainly better
>>> calibrated on the usefulness of the current proposal.
>>>
>>>

 On Mon, Mar 25, 2019 at 7:39 PM Xiangrui Meng 
 wrote:

> There are certainly use cases where different stages require different
> numbers of CPUs or GPUs under an optimal setting. I don't think anyone
> disagrees that ideally users should be able to do it. We are just dealing
> with typical engineering trade-offs and seeing how we can break the
> problem down into smaller ones. I think it is fair to treat the task-level
> resource request as a separate feature here because it also applies to
> CPUs alone, without GPUs, as Tom mentioned above. But with only
> "spark.task.cpus" for many years, Spark has still been able to cover many,
> many use cases; otherwise we wouldn't see so many Spark users around now.
> Here we just apply similar arguments to GPUs.
>
> Initially, I was the person who really wanted task-level requests
> because it is ideal. In an offline discussion, Andy Feng pointed out that
> an application-level setting should fit common deep learning training and
> inference cases and greatly simplifies the changes required to the Spark
> job scheduler. With Imran's feedback on the initial design sketch, the
> application-level approach became my first choice because it is still
> very valuable but much less risky. If a feature brings great value to
> users, we should add it even if it is not ideal.
>
> Back to the default value discussion, let's forget GPUs and only
> consider CPUs. Would an application-level default number of CPU cores
> disappear if we added task-level requests? If yes, does it mean that
> users have to explicitly state the resource requirements for every single
> stage? That is tedious to do, and users who do not fully understand the
> impact would probably do it wrong and waste even more resources. Then how
> many cores should each task use if the user didn't specify? I do see
> "spark.task.cpus" as the answer here. The point I want to make is that
> "spark.task.cpus", though less ideal, is still needed when we have
> task-level requests for CPUs.
>
> On Mon, Mar 25, 2019 at 6:46 PM Mark Hamstra 
> wrote:
>
>> I remain unconvinced that a default configuration at the application
>> level makes sense even in that case.

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-25 Thread Xingbo Jiang
+1 on the updated SPIP

Xingbo Jiang wrote on Tue, Mar 26, 2019 at 1:32 PM:

> Hi all,
>
> Now that we have had a few discussions over the updated SPIP, and we have
> updated it to address new feedback from some committers, IMO the SPIP is
> ready for another round of voting.
> On the updated SPIP, we currently have two +1s (from Tom and Xiangrui);
> everyone else, please vote again.
>
> The vote will be up for the next 72 hours.
>
> Thanks!
>
> Xingbo
>
> Xiangrui Meng wrote on Tue, Mar 26, 2019 at 11:32 AM:
>
>>
>>
>> On Mon, Mar 25, 2019 at 8:07 PM Mark Hamstra 
>> wrote:
>>
>>> Maybe.
>>>
>>> And I expect that we will end up doing something based on
>>> spark.task.cpus in the short term. I'd just rather that this SPIP not make
>>> it look like this is the way things should ideally be done. I'd prefer that
>>> we be quite explicit in recognizing that this approach is a significant
>>> compromise, and I'd like to see at least some references to the beginning
>>> of serious longer-term efforts to do something better in a deeper re-design
>>> of resource scheduling.
>>>
>>
>> It is also a feature I desire as a user. How about suggesting it as
>> future work in the SPIP? It certainly requires someone who fully
>> understands the Spark scheduler to drive it. Shall we start with a Spark
>> JIRA? I don't know the scheduler as well as you do, but I can speak for
>> DL use cases. Maybe we just view it from different angles. To you, the
>> application-level request is a significant compromise. To me, it is a
>> major milestone that brings GPUs to Spark workloads. I know many users
>> who tried to do DL on Spark and ended up doing hacks here and there,
>> which was a huge pain. The scope covered by the current SPIP makes those
>> users much happier. Tom and Andy from NVIDIA are certainly better
>> calibrated on the usefulness of the current proposal.
>>
>>
>>>
>>> On Mon, Mar 25, 2019 at 7:39 PM Xiangrui Meng 
>>> wrote:
>>>
 There are certainly use cases where different stages require different
 numbers of CPUs or GPUs under an optimal setting. I don't think anyone
 disagrees that ideally users should be able to do it. We are just dealing
 with typical engineering trade-offs and seeing how we can break the problem
 down into smaller ones. I think it is fair to treat the task-level resource
 request as a separate feature here because it also applies to CPUs alone,
 without GPUs, as Tom mentioned above. But with only "spark.task.cpus" for
 many years, Spark has still been able to cover many, many use cases;
 otherwise we wouldn't see so many Spark users around now. Here we just
 apply similar arguments to GPUs.

 Initially, I was the person who really wanted task-level requests
 because it is ideal. In an offline discussion, Andy Feng pointed out that
 an application-level setting should fit common deep learning training and
 inference cases and greatly simplifies the changes required to the Spark
 job scheduler. With Imran's feedback on the initial design sketch, the
 application-level approach became my first choice because it is still
 very valuable but much less risky. If a feature brings great value to
 users, we should add it even if it is not ideal.

 Back to the default value discussion, let's forget GPUs and only
 consider CPUs. Would an application-level default number of CPU cores
 disappear if we added task-level requests? If yes, does it mean that
 users have to explicitly state the resource requirements for every single
 stage? That is tedious to do, and users who do not fully understand the
 impact would probably do it wrong and waste even more resources. Then how
 many cores should each task use if the user didn't specify? I do see
 "spark.task.cpus" as the answer here. The point I want to make is that
 "spark.task.cpus", though less ideal, is still needed when we have
 task-level requests for CPUs.

 On Mon, Mar 25, 2019 at 6:46 PM Mark Hamstra 
 wrote:

> I remain unconvinced that a default configuration at the application
> level makes sense even in that case. There may be some applications where
> you know a priori that almost all the tasks for all the stages for all the
> jobs will need some fixed number of gpus; but I think the more common 
> cases
> will be dynamic configuration at the job or stage level. Stage level could
> have a lot of overlap with barrier mode scheduling -- barrier mode stages
> having a need for an inter-task channel resource, gpu-ified stages needing
> gpu resources, etc. Have I mentioned that I'm not a fan of the current
> barrier mode API, Xiangrui? :) Yes, I know: "Show me something better."
>
> On Mon, Mar 25, 2019 at 3:55 PM Xiangrui Meng 
> wrote:
>
>> Even if we support per-task resource requests in the future, it would
>> still be inconvenient for users to declare the resource requirements for
>> every single task/stage. So there must be some default values defined
>> somewhere for task resource requirements.

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-25 Thread Xingbo Jiang
Hi all,

Now that we have had a few discussions over the updated SPIP, and we have
updated it to address new feedback from some committers, IMO the SPIP is
ready for another round of voting.
On the updated SPIP, we currently have two +1s (from Tom and Xiangrui);
everyone else, please vote again.

The vote will be up for the next 72 hours.

Thanks!

Xingbo

Xiangrui Meng wrote on Tue, Mar 26, 2019 at 11:32 AM:

>
>
> On Mon, Mar 25, 2019 at 8:07 PM Mark Hamstra 
> wrote:
>
>> Maybe.
>>
>> And I expect that we will end up doing something based on spark.task.cpus
>> in the short term. I'd just rather that this SPIP not make it look like
>> this is the way things should ideally be done. I'd prefer that we be quite
>> explicit in recognizing that this approach is a significant compromise, and
>> I'd like to see at least some references to the beginning of serious
>> longer-term efforts to do something better in a deeper re-design of
>> resource scheduling.
>>
>
> It is also a feature I desire as a user. How about suggesting it as
> future work in the SPIP? It certainly requires someone who fully
> understands the Spark scheduler to drive it. Shall we start with a Spark
> JIRA? I don't know the scheduler as well as you do, but I can speak for DL
> use cases. Maybe we just view it from different angles. To you, the
> application-level request is a significant compromise. To me, it is a
> major milestone that brings GPUs to Spark workloads. I know many users who
> tried to do DL on Spark and ended up doing hacks here and there, which was
> a huge pain. The scope covered by the current SPIP makes those users much
> happier. Tom and Andy from NVIDIA are certainly better calibrated on the
> usefulness of the current proposal.
>
>
>>
>> On Mon, Mar 25, 2019 at 7:39 PM Xiangrui Meng 
>> wrote:
>>
>>> There are certainly use cases where different stages require different
>>> numbers of CPUs or GPUs under an optimal setting. I don't think anyone
>>> disagrees that ideally users should be able to do it. We are just dealing
>>> with typical engineering trade-offs and seeing how we can break the
>>> problem down into smaller ones. I think it is fair to treat the
>>> task-level resource request as a separate feature here because it also
>>> applies to CPUs alone, without GPUs, as Tom mentioned above. But with
>>> only "spark.task.cpus" for many years, Spark has still been able to cover
>>> many, many use cases; otherwise we wouldn't see so many Spark users
>>> around now. Here we just apply similar arguments to GPUs.
>>>
>>> Initially, I was the person who really wanted task-level requests
>>> because it is ideal. In an offline discussion, Andy Feng pointed out that
>>> an application-level setting should fit common deep learning training and
>>> inference cases and greatly simplifies the changes required to the Spark
>>> job scheduler. With Imran's feedback on the initial design sketch, the
>>> application-level approach became my first choice because it is still
>>> very valuable but much less risky. If a feature brings great value to
>>> users, we should add it even if it is not ideal.
>>>
>>> Back to the default value discussion, let's forget GPUs and only
>>> consider CPUs. Would an application-level default number of CPU cores
>>> disappear if we added task-level requests? If yes, does it mean that
>>> users have to explicitly state the resource requirements for every single
>>> stage? That is tedious to do, and users who do not fully understand the
>>> impact would probably do it wrong and waste even more resources. Then how
>>> many cores should each task use if the user didn't specify? I do see
>>> "spark.task.cpus" as the answer here. The point I want to make is that
>>> "spark.task.cpus", though less ideal, is still needed when we have
>>> task-level requests for CPUs.
>>>
>>> On Mon, Mar 25, 2019 at 6:46 PM Mark Hamstra 
>>> wrote:
>>>
 I remain unconvinced that a default configuration at the application
 level makes sense even in that case. There may be some applications where
 you know a priori that almost all the tasks for all the stages for all the
 jobs will need some fixed number of gpus; but I think the more common cases
 will be dynamic configuration at the job or stage level. Stage level could
 have a lot of overlap with barrier mode scheduling -- barrier mode stages
 having a need for an inter-task channel resource, gpu-ified stages needing
 gpu resources, etc. Have I mentioned that I'm not a fan of the current
 barrier mode API, Xiangrui? :) Yes, I know: "Show me something better."

 On Mon, Mar 25, 2019 at 3:55 PM Xiangrui Meng  wrote:

> Even if we support per-task resource requests in the future, it would
> still be inconvenient for users to declare the resource requirements for
> every single task/stage. So there must be some default values defined
> somewhere for task resource requirements. "spark.task.cpus" and
> "spark.task.accelerator.gpu.count" could serve for this purpose without
> introducing breaking changes.

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-25 Thread Xiangrui Meng
On Mon, Mar 25, 2019 at 8:07 PM Mark Hamstra 
wrote:

> Maybe.
>
> And I expect that we will end up doing something based on spark.task.cpus
> in the short term. I'd just rather that this SPIP not make it look like
> this is the way things should ideally be done. I'd prefer that we be quite
> explicit in recognizing that this approach is a significant compromise, and
> I'd like to see at least some references to the beginning of serious
> longer-term efforts to do something better in a deeper re-design of
> resource scheduling.
>

It is also a feature I desire as a user. How about suggesting it as
future work in the SPIP? It certainly requires someone who fully
understands the Spark scheduler to drive it. Shall we start with a Spark
JIRA? I don't know the scheduler as well as you do, but I can speak for DL
use cases. Maybe we just view it from different angles. To you, the
application-level request is a significant compromise. To me, it is a
major milestone that brings GPUs to Spark workloads. I know many users who
tried to do DL on Spark and ended up doing hacks here and there, which was
a huge pain. The scope covered by the current SPIP makes those users much
happier. Tom and Andy from NVIDIA are certainly better calibrated on the
usefulness of the current proposal.


>
> On Mon, Mar 25, 2019 at 7:39 PM Xiangrui Meng  wrote:
>
>> There are certainly use cases where different stages require different
>> numbers of CPUs or GPUs under an optimal setting. I don't think anyone
>> disagrees that ideally users should be able to do it. We are just dealing
>> with typical engineering trade-offs and seeing how we can break the
>> problem down into smaller ones. I think it is fair to treat the task-level
>> resource request as a separate feature here because it also applies to
>> CPUs alone, without GPUs, as Tom mentioned above. But with only
>> "spark.task.cpus" for many years, Spark has still been able to cover many,
>> many use cases; otherwise we wouldn't see so many Spark users around now.
>> Here we just apply similar arguments to GPUs.
>>
>> Initially, I was the person who really wanted task-level requests because
>> it is ideal. In an offline discussion, Andy Feng pointed out that an
>> application-level setting should fit common deep learning training and
>> inference cases and greatly simplifies the changes required to the Spark
>> job scheduler. With Imran's feedback on the initial design sketch, the
>> application-level approach became my first choice because it is still
>> very valuable but much less risky. If a feature brings great value to
>> users, we should add it even if it is not ideal.
>>
>> Back to the default value discussion, let's forget GPUs and only consider
>> CPUs. Would an application-level default number of CPU cores disappear if
>> we added task-level requests? If yes, does it mean that users have to
>> explicitly state the resource requirements for every single stage? That is
>> tedious to do, and users who do not fully understand the impact would
>> probably do it wrong and waste even more resources. Then how many cores
>> should each task use if the user didn't specify? I do see
>> "spark.task.cpus" as the answer here. The point I want to make is that
>> "spark.task.cpus", though less ideal, is still needed when we have
>> task-level requests for CPUs.
>>
>> On Mon, Mar 25, 2019 at 6:46 PM Mark Hamstra 
>> wrote:
>>
>>> I remain unconvinced that a default configuration at the application
>>> level makes sense even in that case. There may be some applications where
>>> you know a priori that almost all the tasks for all the stages for all the
>>> jobs will need some fixed number of gpus; but I think the more common cases
>>> will be dynamic configuration at the job or stage level. Stage level could
>>> have a lot of overlap with barrier mode scheduling -- barrier mode stages
>>> having a need for an inter-task channel resource, gpu-ified stages needing
>>> gpu resources, etc. Have I mentioned that I'm not a fan of the current
>>> barrier mode API, Xiangrui? :) Yes, I know: "Show me something better."
>>>
>>> On Mon, Mar 25, 2019 at 3:55 PM Xiangrui Meng  wrote:
>>>
 Even if we support per-task resource requests in the future, it would
 still be inconvenient for users to declare the resource requirements for
 every single task/stage. So there must be some default values defined
 somewhere for task resource requirements. "spark.task.cpus" and
 "spark.task.accelerator.gpu.count" could serve for this purpose without
 introducing breaking changes. So I'm +1 on the updated SPIP. It fairly
 separates necessary GPU support from risky scheduler changes.

 On Mon, Mar 25, 2019 at 8:39 AM Mark Hamstra 
 wrote:

> Of course there is an issue of the perfect becoming the enemy of the
> good, so I can understand the impulse to get something done. I am left
> wanting, however, at least something more of a roadmap to a task-level
> future than just a vague "we may choose to do something more in the
> future."

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-25 Thread Mark Hamstra
Maybe.

And I expect that we will end up doing something based on spark.task.cpus
in the short term. I'd just rather that this SPIP not make it look like
this is the way things should ideally be done. I'd prefer that we be quite
explicit in recognizing that this approach is a significant compromise, and
I'd like to see at least some references to the beginning of serious
longer-term efforts to do something better in a deeper re-design of
resource scheduling.

On Mon, Mar 25, 2019 at 7:39 PM Xiangrui Meng  wrote:

> There are certainly use cases where different stages require different
> numbers of CPUs or GPUs under an optimal setting. I don't think anyone
> disagrees that ideally users should be able to do it. We are just dealing
> with typical engineering trade-offs and seeing how we can break the
> problem down into smaller ones. I think it is fair to treat the task-level
> resource request as a separate feature here because it also applies to
> CPUs alone, without GPUs, as Tom mentioned above. But with only
> "spark.task.cpus" for many years, Spark has still been able to cover many,
> many use cases; otherwise we wouldn't see so many Spark users around now.
> Here we just apply similar arguments to GPUs.
>
> Initially, I was the person who really wanted task-level requests because
> it is ideal. In an offline discussion, Andy Feng pointed out that an
> application-level setting should fit common deep learning training and
> inference cases and greatly simplifies the changes required to the Spark
> job scheduler. With Imran's feedback on the initial design sketch, the
> application-level approach became my first choice because it is still
> very valuable but much less risky. If a feature brings great value to
> users, we should add it even if it is not ideal.
>
> Back to the default value discussion, let's forget GPUs and only consider
> CPUs. Would an application-level default number of CPU cores disappear if
> we added task-level requests? If yes, does it mean that users have to
> explicitly state the resource requirements for every single stage? That is
> tedious to do, and users who do not fully understand the impact would
> probably do it wrong and waste even more resources. Then how many cores
> should each task use if the user didn't specify? I do see
> "spark.task.cpus" as the answer here. The point I want to make is that
> "spark.task.cpus", though less ideal, is still needed when we have
> task-level requests for CPUs.
>
> On Mon, Mar 25, 2019 at 6:46 PM Mark Hamstra 
> wrote:
>
>> I remain unconvinced that a default configuration at the application
>> level makes sense even in that case. There may be some applications where
>> you know a priori that almost all the tasks for all the stages for all the
>> jobs will need some fixed number of gpus; but I think the more common cases
>> will be dynamic configuration at the job or stage level. Stage level could
>> have a lot of overlap with barrier mode scheduling -- barrier mode stages
>> having a need for an inter-task channel resource, gpu-ified stages needing
>> gpu resources, etc. Have I mentioned that I'm not a fan of the current
>> barrier mode API, Xiangrui? :) Yes, I know: "Show me something better."
>>
>> On Mon, Mar 25, 2019 at 3:55 PM Xiangrui Meng  wrote:
>>
>>> Even if we support per-task resource requests in the future, it would
>>> still be inconvenient for users to declare the resource requirements for
>>> every single task/stage. So there must be some default values defined
>>> somewhere for task resource requirements. "spark.task.cpus" and
>>> "spark.task.accelerator.gpu.count" could serve for this purpose without
>>> introducing breaking changes. So I'm +1 on the updated SPIP. It fairly
>>> separates necessary GPU support from risky scheduler changes.
>>>
>>> On Mon, Mar 25, 2019 at 8:39 AM Mark Hamstra 
>>> wrote:
>>>
 Of course there is an issue of the perfect becoming the enemy of the
 good, so I can understand the impulse to get something done. I am left
 wanting, however, at least something more of a roadmap to a task-level
 future than just a vague "we may choose to do something more in the
 future." At the risk of repeating myself, I don't think the
 existing spark.task.cpus is very good, and I think that building more on
 that weak foundation without a more clear path or stated intention to move
 to something better runs the risk of leaving Spark stuck in a bad
 neighborhood.

 On Thu, Mar 21, 2019 at 10:10 AM Tom Graves 
 wrote:

> While I agree with you that it would be ideal to have the task-level
> resources and do a deeper redesign for the scheduler, I think that can be
> a separate enhancement, as was discussed earlier in the thread. That
> feature is useful without GPUs. I do realize that they overlap some, but I
> think the changes for this will be minimal to the scheduler, follow
> existing conventions, and it is an improvement over what we have now. I
> know many users will be happy to have this even without the task-level
> scheduling, as many of the conventions used now to schedule gpus can
> easily be broken by one bad user.

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-25 Thread Xiangrui Meng
There are certainly use cases where different stages require different
numbers of CPUs or GPUs under an optimal setting. I don't think anyone
disagrees that ideally users should be able to do it. We are just dealing
with typical engineering trade-offs and seeing how we can break the problem
down into smaller ones. I think it is fair to treat the task-level resource
request as a separate feature here because it also applies to CPUs alone,
without GPUs, as Tom mentioned above. But with only "spark.task.cpus" for
many years, Spark has still been able to cover many, many use cases;
otherwise we wouldn't see so many Spark users around now. Here we just
apply similar arguments to GPUs.

Initially, I was the person who really wanted task-level requests because
it is ideal. In an offline discussion, Andy Feng pointed out that an
application-level setting should fit common deep learning training and
inference cases and greatly simplifies the changes required to the Spark
job scheduler. With Imran's feedback on the initial design sketch, the
application-level approach became my first choice because it is still
very valuable but much less risky. If a feature brings great value to
users, we should add it even if it is not ideal.

Back to the default value discussion, let's forget GPUs and only consider
CPUs. Would an application-level default number of CPU cores disappear if
we added task-level requests? If yes, does it mean that users have to
explicitly state the resource requirements for every single stage? That is
tedious to do, and users who do not fully understand the impact would
probably do it wrong and waste even more resources. Then how many cores
should each task use if the user didn't specify? I do see "spark.task.cpus"
as the answer here. The point I want to make is that "spark.task.cpus",
though less ideal, is still needed when we have task-level requests for
CPUs.
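
To make that fallback concrete: a minimal Scala sketch, where
"effectiveTaskCpus" is a hypothetical helper (not a Spark API) and a
stage-level request, if present, overrides the application-wide default
from "spark.task.cpus":

    // Hypothetical resolution order, purely illustrative.
    def effectiveTaskCpus(stageRequest: Option[Int], conf: Map[String, String]): Int =
      stageRequest.getOrElse(conf.getOrElse("spark.task.cpus", "1").toInt)

    effectiveTaskCpus(Some(4), Map.empty)                   // stage override: 4
    effectiveTaskCpus(None, Map("spark.task.cpus" -> "2"))  // app-wide default: 2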

On Mon, Mar 25, 2019 at 6:46 PM Mark Hamstra 
wrote:

> I remain unconvinced that a default configuration at the application level
> makes sense even in that case. There may be some applications where you
> know a priori that almost all the tasks for all the stages for all the jobs
> will need some fixed number of gpus; but I think the more common cases will
> be dynamic configuration at the job or stage level. Stage level could have
> a lot of overlap with barrier mode scheduling -- barrier mode stages having
> a need for an inter-task channel resource, gpu-ified stages needing gpu
> resources, etc. Have I mentioned that I'm not a fan of the current barrier
> mode API, Xiangrui? :) Yes, I know: "Show me something better."
>
> On Mon, Mar 25, 2019 at 3:55 PM Xiangrui Meng  wrote:
>
>> Even if we support per-task resource requests in the future, it would
>> still be inconvenient for users to declare the resource requirements for
>> every single task/stage. So there must be some default values defined
>> somewhere for task resource requirements. "spark.task.cpus" and
>> "spark.task.accelerator.gpu.count" could serve for this purpose without
>> introducing breaking changes. So I'm +1 on the updated SPIP. It fairly
>> separates necessary GPU support from risky scheduler changes.
>>
>> On Mon, Mar 25, 2019 at 8:39 AM Mark Hamstra 
>> wrote:
>>
>>> Of course there is an issue of the perfect becoming the enemy of the
>>> good, so I can understand the impulse to get something done. I am left
>>> wanting, however, at least something more of a roadmap to a task-level
>>> future than just a vague "we may choose to do something more in the
>>> future." At the risk of repeating myself, I don't think the
>>> existing spark.task.cpus is very good, and I think that building more on
>>> that weak foundation without a more clear path or stated intention to move
>>> to something better runs the risk of leaving Spark stuck in a bad
>>> neighborhood.
>>>
>>> On Thu, Mar 21, 2019 at 10:10 AM Tom Graves 
>>> wrote:
>>>
 While I agree with you that it would be ideal to have the task-level
 resources and do a deeper redesign for the scheduler, I think that can be a
 separate enhancement, as was discussed earlier in the thread. That feature
 is useful without GPUs. I do realize that they overlap some, but I think
 the changes for this will be minimal to the scheduler, follow existing
 conventions, and it is an improvement over what we have now. I know many
 users will be happy to have this even without the task-level scheduling, as
 many of the conventions used now to schedule gpus can easily be broken by
 one bad user. I think from the user point of view this gives many users an
 improvement, and we can extend it later to cover more use cases.

 Tom
 On Thursday, March 21, 2019, 9:15:05 AM PDT, Mark Hamstra <
 m...@clearstorydata.com> wrote:


 I understand the application-level, static, global nature
 of spark.task.accelerator.gpu.count and its similarity to the
 existing spark.task.cpus, but to me this feels like extending a 

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-25 Thread Mark Hamstra
I remain unconvinced that a default configuration at the application level
makes sense even in that case. There may be some applications where you
know a priori that almost all the tasks for all the stages for all the jobs
will need some fixed number of gpus; but I think the more common cases will
be dynamic configuration at the job or stage level. Stage level could have
a lot of overlap with barrier mode scheduling -- barrier mode stages having
a need for an inter-task channel resource, gpu-ified stages needing gpu
resources, etc. Have I mentioned that I'm not a fan of the current barrier
mode API, Xiangrui? :) Yes, I know: "Show me something better."

On Mon, Mar 25, 2019 at 3:55 PM Xiangrui Meng  wrote:

> Even if we support per-task resource requests in the future, it would
> still be inconvenient for users to declare the resource requirements for
> every single task/stage. So there must be some default values defined
> somewhere for task resource requirements. "spark.task.cpus" and
> "spark.task.accelerator.gpu.count" could serve for this purpose without
> introducing breaking changes. So I'm +1 on the updated SPIP. It fairly
> separates necessary GPU support from risky scheduler changes.
>
> On Mon, Mar 25, 2019 at 8:39 AM Mark Hamstra 
> wrote:
>
>> Of course there is an issue of the perfect becoming the enemy of the
>> good, so I can understand the impulse to get something done. I am left
>> wanting, however, at least something more of a roadmap to a task-level
>> future than just a vague "we may choose to do something more in the
>> future." At the risk of repeating myself, I don't think the
>> existing spark.task.cpus is very good, and I think that building more on
>> that weak foundation without a more clear path or stated intention to move
>> to something better runs the risk of leaving Spark stuck in a bad
>> neighborhood.
>>
>> On Thu, Mar 21, 2019 at 10:10 AM Tom Graves  wrote:
>>
>>> While I agree with you that it would be ideal to have the task-level
>>> resources and do a deeper redesign for the scheduler, I think that can
>>> be a separate enhancement, as was discussed earlier in the thread. That
>>> feature is useful without GPUs. I do realize that they overlap some, but
>>> I think the changes for this will be minimal to the scheduler, follow
>>> existing conventions, and it is an improvement over what we have now. I
>>> know many users will be happy to have this even without the task-level
>>> scheduling, as many of the conventions used now to schedule gpus can
>>> easily be broken by one bad user. I think from the user point of view
>>> this gives many users an improvement, and we can extend it later to
>>> cover more use cases.
>>>
>>> Tom
>>> On Thursday, March 21, 2019, 9:15:05 AM PDT, Mark Hamstra <
>>> m...@clearstorydata.com> wrote:
>>>
>>>
>>> I understand the application-level, static, global nature
>>> of spark.task.accelerator.gpu.count and its similarity to the
>>> existing spark.task.cpus, but to me this feels like extending a weakness of
>>> Spark's scheduler, not building on its strengths. That is because I
>>> consider binding the number of cores for each task to an application
>>> configuration to be far from optimal. This is already far from the desired
>>> behavior when an application is running a wide range of jobs (as in a
>>> generic job-runner style of Spark application), some of which require or
>>> can benefit from multi-core tasks, others of which will just waste the
>>> extra cores allocated to their tasks. Ideally, the number of cores
>>> allocated to tasks would get pushed to an even finer granularity than
>>> jobs, instead becoming a per-stage property.
>>>
>>> Now, of course, making allocation of general-purpose cores and
>>> domain-specific resources work in this finer-grained fashion is a lot more
>>> work than just trying to extend the existing resource allocation mechanisms
>>> to handle domain-specific resources, but it does feel to me like we should
>>> at least be considering doing that deeper redesign.
>>>
>>> On Thu, Mar 21, 2019 at 7:33 AM Tom Graves 
>>> wrote:
>>>
>> The proposal here is that all your resources are static and the gpu per
>> task config is global per application, meaning you ask for a certain
>> amount of memory, CPU, and GPUs for every executor up front, just like
>> you do today, and every executor you get is that size.  This means that
>> both static and dynamic allocation still work without explicitly adding
>> more logic at this point. Since the config for gpus per task is global,
>> every task you want to run will need a certain ratio of cpu to gpu. Since
>> that is a global, you can't really have the scenario you mentioned; all
>> tasks are assumed to need a GPU.  For instance, I request 5 cores and 2
>> GPUs for each executor and set 1 gpu per task.  That means that I could
>> only run 2 tasks and 3 cores would be wasted.  The stage/task-level
>> configuration of resources was removed and is something we can do in a
>> separate SPIP.

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-25 Thread Xiangrui Meng
Even if we support per-task resource requests in the future, it would still
be inconvenient for users to declare the resource requirements for every
single task/stage. So there must be some default values defined somewhere
for task resource requirements. "spark.task.cpus" and
"spark.task.accelerator.gpu.count" could serve for this purpose without
introducing breaking changes. So I'm +1 on the updated SPIP. It fairly
separates necessary GPU support from risky scheduler changes.
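
As a concrete illustration of the application-level defaults under
discussion, a minimal Scala sketch (the GPU config name is the one used in
this thread; the name that finally ships may differ):

    // App-wide task resource defaults, set once at application startup.
    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("gpu-app")
      .config("spark.executor.cores", "4")
      .config("spark.task.cpus", "1")                   // CPUs per task, app-wide
      .config("spark.task.accelerator.gpu.count", "1")  // GPUs per task, as named here
      .getOrCreate()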

On Mon, Mar 25, 2019 at 8:39 AM Mark Hamstra 
wrote:

> Of course there is an issue of the perfect becoming the enemy of the good,
> so I can understand the impulse to get something done. I am left wanting,
> however, at least something more of a roadmap to a task-level future than
> just a vague "we may choose to do something more in the future." At the
> risk of repeating myself, I don't think the existing spark.task.cpus is
> very good, and I think that building more on that weak foundation without a
> more clear path or stated intention to move to something better runs the
> risk of leaving Spark stuck in a bad neighborhood.
>
> On Thu, Mar 21, 2019 at 10:10 AM Tom Graves  wrote:
>
>> While I agree with you that it would be ideal to have the task-level
>> resources and do a deeper redesign for the scheduler, I think that can be
>> a separate enhancement, as was discussed earlier in the thread. That
>> feature is useful without GPUs. I do realize that they overlap some, but
>> I think the changes for this will be minimal to the scheduler, follow
>> existing conventions, and it is an improvement over what we have now. I
>> know many users will be happy to have this even without the task-level
>> scheduling, as many of the conventions used now to schedule gpus can
>> easily be broken by one bad user. I think from the user point of view
>> this gives many users an improvement, and we can extend it later to cover
>> more use cases.
>>
>> Tom
>> On Thursday, March 21, 2019, 9:15:05 AM PDT, Mark Hamstra <
>> m...@clearstorydata.com> wrote:
>>
>>
>> I understand the application-level, static, global nature
>> of spark.task.accelerator.gpu.count and its similarity to the
>> existing spark.task.cpus, but to me this feels like extending a weakness of
>> Spark's scheduler, not building on its strengths. That is because I
>> consider binding the number of cores for each task to an application
>> configuration to be far from optimal. This is already far from the desired
>> behavior when an application is running a wide range of jobs (as in a
>> generic job-runner style of Spark application), some of which require or
>> can benefit from multi-core tasks, others of which will just waste the
>> extra cores allocated to their tasks. Ideally, the number of cores
>> allocated to tasks would get pushed to an even finer granularity than
>> jobs, instead becoming a per-stage property.
>>
>> Now, of course, making allocation of general-purpose cores and
>> domain-specific resources work in this finer-grained fashion is a lot more
>> work than just trying to extend the existing resource allocation mechanisms
>> to handle domain-specific resources, but it does feel to me like we should
>> at least be considering doing that deeper redesign.
>>
>> On Thu, Mar 21, 2019 at 7:33 AM Tom Graves 
>> wrote:
>>
>> The proposal here is that all your resources are static and the gpu per
>> task config is global per application, meaning you ask for a certain
>> amount of memory, CPU, and GPUs for every executor up front, just like
>> you do today, and every executor you get is that size.  This means that
>> both static and dynamic allocation still work without explicitly adding
>> more logic at this point. Since the config for gpus per task is global,
>> every task you want to run will need a certain ratio of cpu to gpu. Since
>> that is a global, you can't really have the scenario you mentioned; all
>> tasks are assumed to need a GPU.  For instance, I request 5 cores and 2
>> GPUs for each executor and set 1 gpu per task.  That means that I could
>> only run 2 tasks and 3 cores would be wasted.  The stage/task-level
>> configuration of resources was removed and is something we can do in a
>> separate SPIP.
>> We thought erroring would make it more obvious to the user.  We could
>> change this to a warning if everyone thinks that is better, but I
>> personally like the error until we can implement the lower-level
>> per-stage configuration.
>>
>> Tom
>>
>> On Thursday, March 21, 2019, 1:45:01 AM PDT, Marco Gaido <
>> marcogaid...@gmail.com> wrote:
>>
>>
>> Thanks for this SPIP.
>> I cannot comment on the docs, but just wanted to highlight one thing. On
>> page 5 of the SPIP, when we talk about DRA, I see:
>>
>> "For instance, if each executor consists of 4 CPUs and 2 GPUs, and each
>> task requires 1 CPU and 1 GPU, then we shall throw an error on application
>> start because we shall always have at least 2 idle CPUs per executor"
>>
>> I am not sure this is correct behavior. We might have tasks requiring
>> only CPU running in parallel as well, hence that setup may make sense.

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-25 Thread Mark Hamstra
Of course there is an issue of the perfect becoming the enemy of the good,
so I can understand the impulse to get something done. I am left wanting,
however, at least something more of a roadmap to a task-level future than
just a vague "we may choose to do something more in the future." At the
risk of repeating myself, I don't think the existing spark.task.cpus is
very good, and I think that building more on that weak foundation without a
more clear path or stated intention to move to something better runs the
risk of leaving Spark stuck in a bad neighborhood.

On Thu, Mar 21, 2019 at 10:10 AM Tom Graves  wrote:

> While I agree with you that it would be ideal to have the task-level
> resources and do a deeper redesign for the scheduler, I think that can be
> a separate enhancement, as was discussed earlier in the thread. That
> feature is useful without GPUs. I do realize that they overlap some, but I
> think the changes for this will be minimal to the scheduler, follow
> existing conventions, and it is an improvement over what we have now. I
> know many users will be happy to have this even without the task-level
> scheduling, as many of the conventions used now to schedule gpus can
> easily be broken by one bad user. I think from the user point of view this
> gives many users an improvement, and we can extend it later to cover more
> use cases.
>
> Tom
> On Thursday, March 21, 2019, 9:15:05 AM PDT, Mark Hamstra <
> m...@clearstorydata.com> wrote:
>
>
> I understand the application-level, static, global nature
> of spark.task.accelerator.gpu.count and its similarity to the
> existing spark.task.cpus, but to me this feels like extending a weakness of
> Spark's scheduler, not building on its strengths. That is because I
> consider binding the number of cores for each task to an application
> configuration to be far from optimal. This is already far from the desired
> behavior when an application is running a wide range of jobs (as in a
> generic job-runner style of Spark application), some of which require or
> can benefit from multi-core tasks, others of which will just waste the
> extra cores allocated to their tasks. Ideally, the number of cores
> allocated to tasks would get pushed to an even finer granularity than
> jobs, instead becoming a per-stage property.
>
> Now, of course, making allocation of general-purpose cores and
> domain-specific resources work in this finer-grained fashion is a lot more
> work than just trying to extend the existing resource allocation mechanisms
> to handle domain-specific resources, but it does feel to me like we should
> at least be considering doing that deeper redesign.
>
> On Thu, Mar 21, 2019 at 7:33 AM Tom Graves 
> wrote:
>
> The proposal here is that all your resources are static and the gpu per
> task config is global per application, meaning you ask for a certain
> amount of memory, CPU, and GPUs for every executor up front, just like you
> do today, and every executor you get is that size.  This means that both
> static and dynamic allocation still work without explicitly adding more
> logic at this point. Since the config for gpus per task is global, every
> task you want to run will need a certain ratio of cpu to gpu. Since that
> is a global, you can't really have the scenario you mentioned; all tasks
> are assumed to need a GPU.  For instance, I request 5 cores and 2 GPUs for
> each executor and set 1 gpu per task.  That means that I could only run 2
> tasks and 3 cores would be wasted.  The stage/task-level configuration of
> resources was removed and is something we can do in a separate SPIP.
> We thought erroring would make it more obvious to the user.  We could
> change this to a warning if everyone thinks that is better, but I
> personally like the error until we can implement the lower-level per-stage
> configuration.
>
> Tom
>
> On Thursday, March 21, 2019, 1:45:01 AM PDT, Marco Gaido <
> marcogaid...@gmail.com> wrote:
>
>
> Thanks for this SPIP.
> I cannot comment on the docs, but just wanted to highlight one thing. On
> page 5 of the SPIP, when we talk about DRA, I see:
>
> "For instance, if each executor consists of 4 CPUs and 2 GPUs, and each
> task requires 1 CPU and 1 GPU, then we shall throw an error on application
> start because we shall always have at least 2 idle CPUs per executor"
>
> I am not sure this is correct behavior. We might have tasks requiring only
> CPU running in parallel as well, hence that setup may make sense. I'd
> rather emit a WARN or something similar. Anyway, we just said we will keep
> GPU scheduling at the task level out of scope for the moment, right?
>
> Thanks,
> Marco
>
> Il giorno gio 21 mar 2019 alle ore 01:26 Xiangrui Meng <
> m...@databricks.com> ha scritto:
>
> Steve, the initial work would focus on GPUs, but we will keep the
> interfaces general to support other accelerators in the future. This was
> mentioned in the SPIP and draft design.
>
> Imran, you should have comment permission now. Thanks for making a pass!
> I don't think the proposed 3.0 features should block Spark 3.0 release
> either.

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-25 Thread Tom Graves
+1 on the updated SPIP.
Tom
On Monday, March 18, 2019, 12:56:22 PM CDT, Xingbo Jiang wrote:

Hi all,
I updated the SPIP doc and stories; I hope it now contains a clear scope of
the changes and enough details for the SPIP vote. Please review the updated
docs, thanks!
Xiangrui Meng wrote on Wed, Mar 6, 2019 at 8:35 AM:

How about letting Xingbo make a major revision to the SPIP doc to make it
clear what is proposed? I like Felix's suggestion to switch to the new
Heilmeier template, which helps clarify what is proposed and what is not.
Then let's review the new SPIP and resume the vote.
On Tue, Mar 5, 2019 at 7:54 AM Imran Rashid  wrote:

OK, I suppose we are getting bogged down in what a vote on an SPIP means
anyway, which I guess we can set aside for now.  With the level of detail
in this proposal, I feel like there is a reasonable chance I'd still -1 the
design or implementation.
And the other thing you're implicitly asking the community for is to prioritize 
this feature for continued review and maintenance.  There is already work to be 
done in things like making barrier mode support dynamic allocation 
(SPARK-24942), bugs in failure handling (eg. SPARK-25250), and general 
efficiency of failure handling (eg. SPARK-25341, SPARK-20178).  I'm very 
concerned about getting spread too thin.


But if this is really just a vote on (1) whether better gpu support is
important for spark, in some form, in some release, and (2) whether it is
*possible* to do this in a safe way, then I will vote +0.
On Tue, Mar 5, 2019 at 8:25 AM Tom Graves  wrote:

So to me, most of the questions here are implementation/design questions.
I've had this issue in the past with SPIPs, where I expected to have more
high-level design details but was basically told that belongs in the
follow-on design jira. This makes me think we need to revisit what a SPIP
really needs to contain, which should be done in a separate thread.  Note
that personally I would be for having more high-level details in it. But
the way I read our documentation on a SPIP right now, that detail is all
optional; maybe we could argue it's based on what reviewers request, but
perhaps we should make the wording of that more required.  Thoughts?  We
should probably separate that discussion if people want to talk about that.
For this SPIP in particular, the reason I +1 it is because it came down to 2
questions:
1) Do I think spark should support this? My answer is yes. I think this
would improve spark; users have been requesting both better GPU support and
support for controlling container requests at a finer granularity for a
while.  If spark doesn't support this, users may go to something else, so I
think we should support it.
2) Do I think it's possible to design and implement it without causing large
instabilities?  My opinion here again is yes. I agree with Imran and others
that the scheduler piece needs to be looked at very closely, as we have had
a lot of issues there, and that is why I was asking for more details in the
design jira: https://issues.apache.org/jira/browse/SPARK-27005.  But I do
believe it's possible to do.
If others have reservations on similar questions, then I think we should
resolve them here, or take the discussion of what a SPIP is to a different
thread and then come back to this. Thoughts?
Note there is already a high-level design for at least the core piece, which
is what people seem concerned with, so including it in the SPIP should be
straightforward.
Tom
On Monday, March 4, 2019, 2:52:43 PM CST, Imran Rashid wrote:

On Sun, Mar 3, 2019 at 6:51 PM Xiangrui Meng wrote:

On Sun, Mar 3, 2019 at 10:20 AM Felix Cheung  wrote:
IMO upfront allocation is less useful. Specifically, it is too expensive for
large jobs.

This is also an API/design discussion.

I agree with Felix -- this is more than just an API question.  It has a huge 
impact on the complexity of what you're proposing.  You might be proposing big 
changes to a core and brittle part of spark, which is already short of experts.
I don't see any value in having a vote on "does feature X sound cool?"  We have 
to evaluate the potential benefit against the risks the feature brings and the 
continued maintenance cost.  We don't need super low-level details, but we
have to have a sketch of the design to be able to make that tradeoff.


  

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-21 Thread Tom Graves
While I agree with you that it would be ideal to have the task-level
resources and do a deeper redesign for the scheduler, I think that can be a
separate enhancement, as was discussed earlier in the thread. That feature
is useful without GPUs.  I do realize that they overlap some, but I think
the changes for this will be minimal to the scheduler, follow existing
conventions, and it is an improvement over what we have now. I know many
users will be happy to have this even without the task-level scheduling, as
many of the conventions used now to schedule gpus can easily be broken by
one bad user.  I think from the user point of view this gives many users an
improvement, and we can extend it later to cover more use cases.
Tom

On Thursday, March 21, 2019, 9:15:05 AM PDT, Mark Hamstra wrote:

 I understand the application-level, static, global nature of 
spark.task.accelerator.gpu.count and its similarity to the existing 
spark.task.cpus, but to me this feels like extending a weakness of Spark's 
scheduler, not building on its strengths. That is because I consider binding 
the number of cores for each task to an application configuration to be far 
from optimal. This is already far from the desired behavior when an application 
is running a wide range of jobs (as in a generic job-runner style of Spark 
application), some of which require or can benefit from multi-core tasks, 
others of which will just waste the extra cores allocated to their tasks. 
Ideally, the number of cores allocated to tasks would get pushed to an even 
finer granularity than jobs, instead becoming a per-stage property.
Now, of course, making allocation of general-purpose cores and domain-specific 
resources work in this finer-grained fashion is a lot more work than just 
trying to extend the existing resource allocation mechanisms to handle 
domain-specific resources, but it does feel to me like we should at least be 
considering doing that deeper redesign.  
On Thu, Mar 21, 2019 at 7:33 AM Tom Graves  wrote:

The proposal here is that all your resources are static and the gpu per task
config is global per application, meaning you ask for a certain amount of
memory, CPU, and GPUs for every executor up front, just like you do today,
and every executor you get is that size.  This means that both static and
dynamic allocation still work without explicitly adding more logic at this
point. Since the config for gpus per task is global, every task you want to
run will need a certain ratio of cpu to gpu.  Since that is a global, you
can't really have the scenario you mentioned; all tasks are assumed to need
a GPU.  For instance, I request 5 cores and 2 GPUs for each executor and set
1 gpu per task.  That means that I could only run 2 tasks and 3 cores would
be wasted.  The stage/task-level configuration of resources was removed and
is something we can do in a separate SPIP. We thought erroring would make it
more obvious to the user.  We could change this to a warning if everyone
thinks that is better, but I personally like the error until we can
implement the lower-level per-stage configuration.
Tom
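For readers doing the arithmetic: under a single application-wide per-task GPU 
setting, the number of concurrent tasks per executor is bounded by the scarcer 
resource. A minimal sketch in plain Scala using the example numbers above; the 
spark.task.* names are the ones proposed in this thread, not a shipped API:

    // Per-executor resources, requested up front (Tom's example).
    val executorCores = 5
    val executorGpus  = 2
    // Application-global per-task requirements (the proposed spark.task.cpus
    // and spark.task.accelerator.gpu.count settings).
    val taskCpus = 1
    val taskGpus = 1

    // Concurrent tasks per executor: the binding minimum of the two ratios.
    val slots = math.min(executorCores / taskCpus, executorGpus / taskGpus) // 2
    val idleCores = executorCores - slots * taskCpus                        // 3
    println(s"$slots concurrent tasks, $idleCores idle cores per executor")

The same arithmetic reproduces Marco's DRA example quoted below: 4 CPUs and 2 
GPUs per executor with 1 CPU + 1 GPU tasks gives 2 slots and 2 idle CPUs, which 
is the condition the SPIP proposes to flag at application start.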
On Thursday, March 21, 2019, 1:45:01 AM PDT, Marco Gaido 
 wrote:  
 
 Thanks for this SPIP. I cannot comment on the docs, but just wanted to 
highlight one thing. On page 5 of the SPIP, when we talk about DRA, I see:
"For instance, if each executor consists 4 CPUs and 2 GPUs, and each task 
requires 1 CPU and 1GPU, then we shall throw an error on application start 
because we shall always have at least 2 idle CPUs per executor"
I am not sure this is a correct behavior. We might have tasks requiring only 
CPU running in parallel as well, hence that may make sense. I'd rather emit a 
WARN or something similar. Anyway we just said we will keep GPU scheduling on 
task level out of scope for the moment, right?
Thanks,
Marco
On Thu, Mar 21, 2019 at 01:26, Xiangrui Meng wrote:

Steve, the initial work would focus on GPUs, but we will keep the interfaces 
general to support other accelerators in the future. This was mentioned in the 
SPIP and draft design. 
Imran, you should have comment permission now. Thanks for making a pass! I 
don't think the proposed 3.0 features should block Spark 3.0 release either. It 
is just an estimate of what we could deliver. I will update the doc to make it 
clear.
Felix, it would be great if you can review the updated docs and let us know 
your feedback.
** How about setting a tentative vote closing time to next Tue (Mar 26)?
On Wed, Mar 20, 2019 at 11:01 AM Imran Rashid  wrote:

Thanks for sending the updated docs.  Can you please give everyone the ability 
to comment?  I have some comments, but overall I think this is a good proposal 
and addresses my prior concerns.
My only real concern is that I notice some mention of "must dos" for spark 3.0. 
 I don't want to make any commitment to holding spark 3.0 for parts of this, I 
think that is an entirely separate decision.  However I'm guessing this is just 
a minor wording issue, and you really mean that's a minimal set of features you 
are aiming for, which is reasonable.

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-21 Thread Mark Hamstra
I understand the application-level, static, global nature
of spark.task.accelerator.gpu.count and its similarity to the
existing spark.task.cpus, but to me this feels like extending a weakness of
Spark's scheduler, not building on its strengths. That is because I
consider binding the number of cores for each task to an application
configuration to be far from optimal. This is already far from the desired
behavior when an application is running a wide range of jobs (as in a
generic job-runner style of Spark application), some of which require or
can benefit from multi-core tasks, others of which will just waste the
extra cores allocated to their tasks. Ideally, the number of cores
allocated to tasks would get pushed to an even finer granularity than jobs,
instead becoming a per-stage property.

Now, of course, making allocation of general-purpose cores and
domain-specific resources work in this finer-grained fashion is a lot more
work than just trying to extend the existing resource allocation mechanisms
to handle domain-specific resources, but it does feel to me like we should
at least be considering doing that deeper redesign.
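For concreteness, a per-stage request of the kind being asked for here could 
look like the sketch below. The shape follows the stage-level scheduling API 
that Spark later added in 3.1 (SPARK-27495, ResourceProfile); it is shown only 
to illustrate the granularity being argued for and was not part of this SPIP. 
The data and training function are stand-ins:

    import org.apache.spark.resource.{ExecutorResourceRequests, ResourceProfileBuilder, TaskResourceRequests}

    // Assumes an existing SparkContext `sc`; dataRdd and trainOnPartition are
    // placeholders for the job's real data and training logic.
    val dataRdd = sc.parallelize(1 to 1000000)
    def trainOnPartition(it: Iterator[Int]): Iterator[Int] = it // placeholder

    // Executors for this stage: 4 cores and 1 GPU each (illustrative numbers).
    val execReqs = new ExecutorResourceRequests().cores(4).resource("gpu", 1)
    // Tasks in this stage: 2 CPUs and 1 GPU each, overriding the app-wide default.
    val taskReqs = new TaskResourceRequests().cpus(2).resource("gpu", 1.0)
    val profile  = new ResourceProfileBuilder().require(execReqs).require(taskReqs).build

    // Only this RDD's stages use the profile; other jobs keep their defaults.
    val trained = dataRdd.withResources(profile).mapPartitions(trainOnPartition)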

On Thu, Mar 21, 2019 at 7:33 AM Tom Graves 
wrote:

> The proposal here is that all your resources are static and the gpu per
> task config is global per application, meaning you ask for a certain amount of
> memory, cpu, and GPUs for every executor up front, just like you do today, and
> every executor you get is that size.  This means that both static and
> dynamic allocation still work without explicitly adding more logic at this
> point. Since the config for gpu per task is global, it means every task you
> want will need a certain ratio of cpu to gpu.  Since that is a global, you
> can't really have the scenario you mentioned; all tasks are assumed to
> need a GPU.  For instance, I request 5 cores, 2 GPUs, and set 1 gpu per task for
> each executor.  That means that I could only run 2 tasks and 3 cores would
> be wasted.  The stage/task level configuration of resources was removed and
> is something we can do in a separate SPIP.
> We thought erroring would make it more obvious to the user.  We could
> change this to a warning if everyone thinks that is better, but I personally
> like the error until we can implement the lower-level per-stage
> configuration.
>
> Tom
>
> On Thursday, March 21, 2019, 1:45:01 AM PDT, Marco Gaido <
> marcogaid...@gmail.com> wrote:
>
>
> Thanks for this SPIP.
> I cannot comment on the docs, but just wanted to highlight one thing. On
> page 5 of the SPIP, when we talk about DRA, I see:
>
> "For instance, if each executor consists 4 CPUs and 2 GPUs, and each task
> requires 1 CPU and 1GPU, then we shall throw an error on application start
> because we shall always have at least 2 idle CPUs per executor"
>
> I am not sure this is a correct behavior. We might have tasks requiring
> only CPU running in parallel as well, hence that may make sense. I'd rather
> emit a WARN or something similar. Anyway we just said we will keep GPU
> scheduling on task level out of scope for the moment, right?
>
> Thanks,
> Marco
>
> On Thu, Mar 21, 2019 at 01:26, Xiangrui Meng <m...@databricks.com> wrote:
>
> Steve, the initial work would focus on GPUs, but we will keep the
> interfaces general to support other accelerators in the future. This was
> mentioned in the SPIP and draft design.
>
> Imran, you should have comment permission now. Thanks for making a pass! I
> don't think the proposed 3.0 features should block Spark 3.0 release
> either. It is just an estimate of what we could deliver. I will update the
> doc to make it clear.
>
> Felix, it would be great if you can review the updated docs and let us
> know your feedback.
>
> ** How about setting a tentative vote closing time to next Tue (Mar 26)?
>
> On Wed, Mar 20, 2019 at 11:01 AM Imran Rashid 
> wrote:
>
> Thanks for sending the updated docs.  Can you please give everyone the
> ability to comment?  I have some comments, but overall I think this is a
> good proposal and addresses my prior concerns.
>
> My only real concern is that I notice some mention of "must dos" for spark
> 3.0.  I don't want to make any commitment to holding spark 3.0 for parts of
> this, I think that is an entirely separate decision.  However I'm guessing
> this is just a minor wording issue, and you really mean that's a minimal
> set of features you are aiming for, which is reasonable.
>
> On Mon, Mar 18, 2019 at 12:56 PM Xingbo Jiang 
> wrote:
>
> Hi all,
>
> I updated the SPIP doc
> 
> and stories
> ,
> I hope it now contains clear scope of the changes and enough details for
> SPIP vote.
> Please review the updated docs, thanks!
>
> Xiangrui Meng wrote on Wed, Mar 6, 2019 at 8:35 AM:
>
> How about letting Xingbo make a major revision to the SPIP doc to make it
> clear what is proposed?

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-21 Thread Tom Graves
 The proposal here is that all your resources are static and the gpu per task 
config is global per application, meaning you ask for a certain amount of memory, 
cpu, and GPUs for every executor up front, just like you do today, and every executor 
you get is that size.  This means that both static and dynamic allocation still 
work without explicitly adding more logic at this point. Since the config for 
gpu per task is global, it means every task you want will need a certain ratio 
of cpu to gpu.  Since that is a global, you can't really have the scenario you 
mentioned; all tasks are assumed to need a GPU.  For instance, I request 5 
cores, 2 GPUs, and set 1 gpu per task for each executor.  That means that I could 
only run 2 tasks and 3 cores would be wasted.  The stage/task level 
configuration of resources was removed and is something we can do in a separate 
SPIP. We thought erroring would make it more obvious to the user.  We could 
change this to a warning if everyone thinks that is better, but I personally 
like the error until we can implement the lower-level per-stage 
configuration. 
Tom
On Thursday, March 21, 2019, 1:45:01 AM PDT, Marco Gaido 
 wrote:  
 
 Thanks for this SPIP. I cannot comment on the docs, but just wanted to 
highlight one thing. On page 5 of the SPIP, when we talk about DRA, I see:
"For instance, if each executor consists 4 CPUs and 2 GPUs, and each task 
requires 1 CPU and 1GPU, then we shall throw an error on application start 
because we shall always have at least 2 idle CPUs per executor"
I am not sure this is a correct behavior. We might have tasks requiring only 
CPU running in parallel as well, hence that may make sense. I'd rather emit a 
WARN or something similar. Anyway we just said we will keep GPU scheduling on 
task level out of scope for the moment, right?
Thanks,
Marco
On Thu, Mar 21, 2019 at 01:26, Xiangrui Meng wrote:

Steve, the initial work would focus on GPUs, but we will keep the interfaces 
general to support other accelerators in the future. This was mentioned in the 
SPIP and draft design. 
Imran, you should have comment permission now. Thanks for making a pass! I 
don't think the proposed 3.0 features should block Spark 3.0 release either. It 
is just an estimate of what we could deliver. I will update the doc to make it 
clear.
Felix, it would be great if you can review the updated docs and let us know 
your feedback.
** How about setting a tentative vote closing time to next Tue (Mar 26)?
On Wed, Mar 20, 2019 at 11:01 AM Imran Rashid  wrote:

Thanks for sending the updated docs.  Can you please give everyone the ability 
to comment?  I have some comments, but overall I think this is a good proposal 
and addresses my prior concerns.
My only real concern is that I notice some mention of "must dos" for spark 3.0. 
 I don't want to make any commitment to holding spark 3.0 for parts of this, I 
think that is an entirely separate decision.  However I'm guessing this is just 
a minor wording issue, and you really mean that's a minimal set of features you 
are aiming for, which is reasonable.
On Mon, Mar 18, 2019 at 12:56 PM Xingbo Jiang  wrote:

Hi all,
I updated the SPIP doc and stories, I hope it now contains clear scope of the 
changes and enough details for SPIP vote. Please review the updated docs, thanks!
Xiangrui Meng wrote on Wed, Mar 6, 2019 at 8:35 AM:

How about letting Xingbo make a major revision to the SPIP doc to make it clear 
what is proposed? I like Felix's suggestion to switch to the new Heilmeier 
template, which helps clarify what is proposed and what is not. Then let's 
review the new SPIP and resume the vote.
On Tue, Mar 5, 2019 at 7:54 AM Imran Rashid  wrote:

OK, I suppose then we are getting bogged down into what a vote on an SPIP means 
then anyway, which I guess we can set aside for now.  With the level of detail 
in this proposal, I feel like there is a reasonable chance I'd still -1 the 
design or implementation.
And the other thing you're implicitly asking the community for is to prioritize 
this feature for continued review and maintenance.  There is already work to be 
done in things like making barrier mode support dynamic allocation 
(SPARK-24942), bugs in failure handling (eg. SPARK-25250), and general 
efficiency of failure handling (eg. SPARK-25341, SPARK-20178).  I'm very 
concerned about getting spread too thin.


But if this is really just a vote on (1) is better gpu support important for 
spark, in some form, in some release? and (2) is it *possible* to do this in a 
safe way?  then I will vote +0.
On Tue, Mar 5, 2019 at 8:25 AM Tom Graves  wrote:

 So to me most of the questions here are implementation/design questions, I've 
had this issue in the past with SPIP's where I expected to have more high level 
design details but was basically told that belongs in the design jira follow 
on. This makes me think we need to revisit what a SPIP really needs to contain, 
which should be done in a separate thread.  Note personally I would be for 
having more high level details in it.

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-21 Thread Marco Gaido
Thanks for this SPIP.
I cannot comment on the docs, but just wanted to highlight one thing. On
page 5 of the SPIP, when we talk about DRA, I see:

"For instance, if each executor consists 4 CPUs and 2 GPUs, and each task
requires 1 CPU and 1GPU, then we shall throw an error on application start
because we shall always have at least 2 idle CPUs per executor"

I am not sure this is a correct behavior. We might have tasks requiring
only CPU running in parallel as well, hence that may make sense. I'd rather
emit a WARN or something similar. Anyway we just said we will keep GPU
scheduling on task level out of scope for the moment, right?

Thanks,
Marco

On Thu, Mar 21, 2019 at 01:26, Xiangrui Meng wrote:

> Steve, the initial work would focus on GPUs, but we will keep the
> interfaces general to support other accelerators in the future. This was
> mentioned in the SPIP and draft design.
>
> Imran, you should have comment permission now. Thanks for making a pass! I
> don't think the proposed 3.0 features should block Spark 3.0 release
> either. It is just an estimate of what we could deliver. I will update the
> doc to make it clear.
>
> Felix, it would be great if you can review the updated docs and let us
> know your feedback.
>
> ** How about setting a tentative vote closing time to next Tue (Mar 26)?
>
> On Wed, Mar 20, 2019 at 11:01 AM Imran Rashid 
> wrote:
>
>> Thanks for sending the updated docs.  Can you please give everyone the
>> ability to comment?  I have some comments, but overall I think this is a
>> good proposal and addresses my prior concerns.
>>
>> My only real concern is that I notice some mention of "must dos" for
>> spark 3.0.  I don't want to make any commitment to holding spark 3.0 for
>> parts of this, I think that is an entirely separate decision.  However I'm
>> guessing this is just a minor wording issue, and you really mean that's a
>> minimal set of features you are aiming for, which is reasonable.
>>
>> On Mon, Mar 18, 2019 at 12:56 PM Xingbo Jiang 
>> wrote:
>>
>>> Hi all,
>>>
>>> I updated the SPIP doc
>>> 
>>> and stories
>>> ,
>>> I hope it now contains clear scope of the changes and enough details for
>>> SPIP vote.
>>> Please review the updated docs, thanks!
>>>
>>> Xiangrui Meng wrote on Wed, Mar 6, 2019 at 8:35 AM:
>>>
 How about letting Xingbo make a major revision to the SPIP doc to make
 it clear what is proposed? I like Felix's suggestion to switch to the new
 Heilmeier template, which helps clarify what is proposed and what is not.
 Then let's review the new SPIP and resume the vote.

 On Tue, Mar 5, 2019 at 7:54 AM Imran Rashid 
 wrote:

> OK, I suppose then we are getting bogged down into what a vote on an
> SPIP means then anyway, which I guess we can set aside for now.  With the
> level of detail in this proposal, I feel like there is a reasonable chance
> I'd still -1 the design or implementation.
>
> And the other thing you're implicitly asking the community for is to
> prioritize this feature for continued review and maintenance.  There is
> already work to be done in things like making barrier mode support dynamic
> allocation (SPARK-24942), bugs in failure handling (eg. SPARK-25250), and
> general efficiency of failure handling (eg. SPARK-25341, SPARK-20178).  
> I'm
> very concerned about getting spread too thin.
>

> But if this is really just a vote on (1) is better gpu support
> important for spark, in some form, in some release? and (2) is it
> *possible* to do this in a safe way?  then I will vote +0.
>
> On Tue, Mar 5, 2019 at 8:25 AM Tom Graves 
> wrote:
>
>> So to me most of the questions here are implementation/design
>> questions, I've had this issue in the past with SPIP's where I expected 
>> to
>> have more high level design details but was basically told that belongs 
>> in
>> the design jira follow on. This makes me think we need to revisit what a
>> SPIP really needs to contain, which should be done in a separate thread.
>> Note personally I would be for having more high level details in it.
>> But the way I read our documentation on a SPIP right now that detail
>> is all optional, now maybe we could argue its based on what reviewers
>> request, but really perhaps we should make the wording of that more
>> required.  thoughts?  We should probably separate that discussion if 
>> people
>> want to talk about that.
>>
>> For this SPIP in particular the reason I +1 it is because it came
>> down to 2 questions:
>>
>> 1) do I think spark should support this -> my answer is yes, I think
>> this would improve spark, users have been requesting both better GPUs
>> support and support for controlling container requests at a finer
>> granularity for a while.

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-20 Thread Xiangrui Meng
Steve, the initial work would focus on GPUs, but we will keep the
interfaces general to support other accelerators in the future. This was
mentioned in the SPIP and draft design.

Imran, you should have comment permission now. Thanks for making a pass! I
don't think the proposed 3.0 features should block Spark 3.0 release
either. It is just an estimate of what we could deliver. I will update the
doc to make it clear.

Felix, it would be great if you can review the updated docs and let us know
your feedback.

** How about setting a tentative vote closing time to next Tue (Mar 26)?

On Wed, Mar 20, 2019 at 11:01 AM Imran Rashid  wrote:

> Thanks for sending the updated docs.  Can you please give everyone the
> ability to comment?  I have some comments, but overall I think this is a
> good proposal and addresses my prior concerns.
>
> My only real concern is that I notice some mention of "must dos" for spark
> 3.0.  I don't want to make any commitment to holding spark 3.0 for parts of
> this, I think that is an entirely separate decision.  However I'm guessing
> this is just a minor wording issue, and you really mean that's a minimal
> set of features you are aiming for, which is reasonable.
>
> On Mon, Mar 18, 2019 at 12:56 PM Xingbo Jiang 
> wrote:
>
>> Hi all,
>>
>> I updated the SPIP doc
>> 
>> and stories
>> ,
>> I hope it now contains clear scope of the changes and enough details for
>> SPIP vote.
>> Please review the updated docs, thanks!
>>
>> Xiangrui Meng wrote on Wed, Mar 6, 2019 at 8:35 AM:
>>
>>> How about letting Xingbo make a major revision to the SPIP doc to make
>>> it clear what is proposed? I like Felix's suggestion to switch to the new
>>> Heilmeier template, which helps clarify what is proposed and what is not.
>>> Then let's review the new SPIP and resume the vote.
>>>
>>> On Tue, Mar 5, 2019 at 7:54 AM Imran Rashid 
>>> wrote:
>>>
 OK, I suppose then we are getting bogged down into what a vote on an
 SPIP means then anyway, which I guess we can set aside for now.  With the
 level of detail in this proposal, I feel like there is a reasonable chance
 I'd still -1 the design or implementation.

 And the other thing you're implicitly asking the community for is to
 prioritize this feature for continued review and maintenance.  There is
 already work to be done in things like making barrier mode support dynamic
 allocation (SPARK-24942), bugs in failure handling (eg. SPARK-25250), and
 general efficiency of failure handling (eg. SPARK-25341, SPARK-20178).  I'm
 very concerned about getting spread too thin.

>>>
 But if this is really just a vote on (1) is better gpu support
 important for spark, in some form, in some release? and (2) is it
 *possible* to do this in a safe way?  then I will vote +0.

 On Tue, Mar 5, 2019 at 8:25 AM Tom Graves  wrote:

> So to me most of the questions here are implementation/design
> questions, I've had this issue in the past with SPIP's where I expected to
> have more high level design details but was basically told that belongs in
> the design jira follow on. This makes me think we need to revisit what a
> SPIP really needs to contain, which should be done in a separate thread.
> Note personally I would be for having more high level details in it.
> But the way I read our documentation on a SPIP right now that detail
> is all optional, now maybe we could argue its based on what reviewers
> request, but really perhaps we should make the wording of that more
> required.  thoughts?  We should probably separate that discussion if 
> people
> want to talk about that.
>
> For this SPIP in particular the reason I +1 it is because it came down
> to 2 questions:
>
> 1) do I think spark should support this -> my answer is yes, I think
> this would improve spark, users have been requesting both better GPUs
> support and support for controlling container requests at a finer
> granularity for a while.  If spark doesn't support this then users may go
> to something else, so I think we should support it
>
> 2) do I think its possible to design and implement it without causing
> large instabilities?   My opinion here again is yes. I agree with Imran 
> and
> others that the scheduler piece needs to be looked at very closely as we
> have had a lot of issues there and that is why I was asking for more
> details in the design jira:
> https://issues.apache.org/jira/browse/SPARK-27005.  But I do believe
> it's possible to do.
>
> If others have reservations on similar questions then I think we
> should resolve here or take the discussion of what a SPIP is to a different
> thread and then come back to this, thoughts?

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-20 Thread Imran Rashid
Thanks for sending the updated docs.  Can you please give everyone the
ability to comment?  I have some comments, but overall I think this is a
good proposal and addresses my prior concerns.

My only real concern is that I notice some mention of "must dos" for spark
3.0.  I don't want to make any commitment to holding spark 3.0 for parts of
this, I think that is an entirely separate decision.  However I'm guessing
this is just a minor wording issue, and you really mean that's a minimal
set of features you are aiming for, which is reasonable.

On Mon, Mar 18, 2019 at 12:56 PM Xingbo Jiang  wrote:

> Hi all,
>
> I updated the SPIP doc
> 
> and stories
> ,
> I hope it now contains clear scope of the changes and enough details for
> SPIP vote.
> Please review the updated docs, thanks!
>
> Xiangrui Meng wrote on Wed, Mar 6, 2019 at 8:35 AM:
>
>> How about letting Xingbo make a major revision to the SPIP doc to make it
>> clear what is proposed? I like Felix's suggestion to switch to the new
>> Heilmeier template, which helps clarify what is proposed and what is not.
>> Then let's review the new SPIP and resume the vote.
>>
>> On Tue, Mar 5, 2019 at 7:54 AM Imran Rashid  wrote:
>>
>>> OK, I suppose then we are getting bogged down into what a vote on an
>>> SPIP means then anyway, which I guess we can set aside for now.  With the
>>> level of detail in this proposal, I feel like there is a reasonable chance
>>> I'd still -1 the design or implementation.
>>>
>>> And the other thing you're implicitly asking the community for is to
>>> prioritize this feature for continued review and maintenance.  There is
>>> already work to be done in things like making barrier mode support dynamic
>>> allocation (SPARK-24942), bugs in failure handling (eg. SPARK-25250), and
>>> general efficiency of failure handling (eg. SPARK-25341, SPARK-20178).  I'm
>>> very concerned about getting spread too thin.
>>>
>>
>>> But if this is really just a vote on (1) is better gpu support important
>>> for spark, in some form, in some release? and (2) is it *possible* to do
>>> this in a safe way?  then I will vote +0.
>>>
>>> On Tue, Mar 5, 2019 at 8:25 AM Tom Graves  wrote:
>>>
 So to me most of the questions here are implementation/design
 questions, I've had this issue in the past with SPIP's where I expected to
 have more high level design details but was basically told that belongs in
 the design jira follow on. This makes me think we need to revisit what a
 SPIP really needs to contain, which should be done in a separate thread.
 Note personally I would be for having more high level details in it.
 But the way I read our documentation on a SPIP right now that detail is
 all optional, now maybe we could argue its based on what reviewers request,
 but really perhaps we should make the wording of that more required.
  thoughts?  We should probably separate that discussion if people want to
 talk about that.

 For this SPIP in particular the reason I +1 it is because it came down
 to 2 questions:

 1) do I think spark should support this -> my answer is yes, I think
 this would improve spark, users have been requesting both better GPUs
 support and support for controlling container requests at a finer
 granularity for a while.  If spark doesn't support this then users may go
 to something else, so I think we should support it

 2) do I think its possible to design and implement it without causing
 large instabilities?   My opinion here again is yes. I agree with Imran and
 others that the scheduler piece needs to be looked at very closely as we
 have had a lot of issues there and that is why I was asking for more
 details in the design jira:
 https://issues.apache.org/jira/browse/SPARK-27005.  But I do believe
 it's possible to do.

 If others have reservations on similar questions then I think we should
 resolve here or take the discussion of what a SPIP is to a different thread
 and then come back to this, thoughts?

 Note there is a high level design for at least the core piece, which is
 what people seem concerned with, already so including it in the SPIP should
 be straightforward.

 Tom

 On Monday, March 4, 2019, 2:52:43 PM CST, Imran Rashid <
 im...@therashids.com> wrote:


 On Sun, Mar 3, 2019 at 6:51 PM Xiangrui Meng  wrote:

 On Sun, Mar 3, 2019 at 10:20 AM Felix Cheung 
 wrote:

 IMO upfront allocation is less useful. Specifically too expensive for
 large jobs.


 This is also an API/design discussion.


 I agree with Felix -- this is more than just an API question.  It has a
 huge impact on the complexity of what you're proposing.

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-19 Thread Jörn Franke
Also on AWS and probably some more cloud providers 

> Am 19.03.2019 um 19:45 schrieb Steve Loughran :
> 
> 
> you might want to look at the work on FPGA resources; again it should just be 
> a resource available by a scheduler. Key thing is probably just to keep the 
> docs generic
> 
> https://hadoop.apache.org/docs/r3.1.0/hadoop-yarn/hadoop-yarn-site/UsingFPGA.html
> 
> I don't know where you get those FPGAs to play with; the Azure ML stuff looks 
> like the kind of thing to think about though: 
> https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-accelerate-with-fpgas


Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-19 Thread Steve Loughran
you might want to look at the work on FPGA resources; again it should just
be a resource available by a scheduler. Key thing is probably just to keep
the docs generic

https://hadoop.apache.org/docs/r3.1.0/hadoop-yarn/hadoop-yarn-site/UsingFPGA.html

I don't know where you get those FPGAs to play with; the Azure ML stuff
looks like the kind of thing to think about though:
https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-accelerate-with-fpgas

>


Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-19 Thread Xiangrui Meng
Sean, thanks for your input and making a pass on the updated SPIP!

As the next step, how about having a remote meeting to discuss the
remaining topics? I started a doodle poll here
. Due to time constraints, I
suggest limiting the attendees to committers and posting the meeting
summary to JIRA after.

On Tue, Mar 19, 2019 at 10:16 AM Sean Owen  wrote:

> This looks like a great level of detail. The broad strokes look good to me.
>
> I'm happy with just about any story around what to do with Mesos GPU
> support now, but might at least deserve a mention: does the existing
> Mesos config simply become a deprecated alias for the
> spark.executor.accelerator.gpu.count? and no further support is added
> to Mesos? that seems entirely coherent, and if that's agreeable, could
> be worth a line here.
>

I would go with the deprecated alias option. But I would defer the decision to
some committer who is willing to shepherd the Mesos sub-project.


>
> I think it could go into Spark 3 but need not block it. This doesn't
> say it does, merely says it's desirable to have it ready for 3.0 if
> possible. That seems like a fine position.
>
> On Mon, Mar 18, 2019 at 1:56 PM Xingbo Jiang 
> wrote:
> >
> > Hi all,
> >
> > I updated the SPIP doc and stories, I hope it now contains clear scope
> of the changes and enough details for SPIP vote.
> > Please review the updated docs, thanks!
> >
> > Xiangrui Meng wrote on Wed, Mar 6, 2019 at 8:35 AM:
> >>
> >> How about letting Xingbo make a major revision to the SPIP doc to make
> it clear what is proposed? I like Felix's suggestion to switch to the new
> Heilmeier template, which helps clarify what is proposed and what is not.
> Then let's review the new SPIP and resume the vote.
> >>
> >> On Tue, Mar 5, 2019 at 7:54 AM Imran Rashid 
> wrote:
> >>>
> >>> OK, I suppose then we are getting bogged down into what a vote on an
> SPIP means then anyway, which I guess we can set aside for now.  With the
> level of detail in this proposal, I feel like there is a reasonable chance
> I'd still -1 the design or implementation.
> >>>
> >>> And the other thing you're implicitly asking the community for is to
> prioritize this feature for continued review and maintenance.  There is
> already work to be done in things like making barrier mode support dynamic
> allocation (SPARK-24942), bugs in failure handling (eg. SPARK-25250), and
> general efficiency of failure handling (eg. SPARK-25341, SPARK-20178).  I'm
> very concerned about getting spread too thin.
> >>>
> >>>
> >>> But if this is really just a vote on (1) is better gpu support
> important for spark, in some form, in some release? and (2) is it
> *possible* to do this in a safe way?  then I will vote +0.
> >>>
> >>> On Tue, Mar 5, 2019 at 8:25 AM Tom Graves 
> wrote:
> 
>  So to me most of the questions here are implementation/design
> questions, I've had this issue in the past with SPIP's where I expected to
> have more high level design details but was basically told that belongs in
> the design jira follow on. This makes me think we need to revisit what a
> SPIP really needs to contain, which should be done in a separate thread.
> Note personally I would be for having more high level details in it.
>  But the way I read our documentation on a SPIP right now that detail
> is all optional, now maybe we could argue its based on what reviewers
> request, but really perhaps we should make the wording of that more
> required.  thoughts?  We should probably separate that discussion if people
> want to talk about that.
> 
>  For this SPIP in particular the reason I +1 it is because it came
> down to 2 questions:
> 
>  1) do I think spark should support this -> my answer is yes, I think
> this would improve spark, users have been requesting both better GPUs
> support and support for controlling container requests at a finer
> granularity for a while.  If spark doesn't support this then users may go
> to something else, so I think we should support it
> 
>  2) do I think its possible to design and implement it without causing
> large instabilities?   My opinion here again is yes. I agree with Imran and
> others that the scheduler piece needs to be looked at very closely as we
> have had a lot of issues there and that is why I was asking for more
> details in the design jira:
> https://issues.apache.org/jira/browse/SPARK-27005.  But I do believe it's
> possible to do.
> 
>  If others have reservations on similar questions then I think we
> should resolve here or take the discussion of what a SPIP is to a different
> thread and then come back to this, thoughts?
> 
>  Note there is a high level design for at least the core piece, which
> is what people seem concerned with, already so including it in the SPIP
> should be straightforward.
> 
>  Tom
> 
>  On Monday, March 4, 2019, 2:52:43 PM CST, Imran Rashid <
> im...@therashids.com> wrote:
> 
> 

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-19 Thread Sean Owen
This looks like a great level of detail. The broad strokes look good to me.

I'm happy with just about any story around what to do with Mesos GPU
support now, but might at least deserve a mention: does the existing
Mesos config simply become a deprecated alias for the
spark.executor.accelerator.gpu.count? and no further support is added
to Mesos? that seems entirely coherent, and if that's agreeable, could
be worth a line here.
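If the deprecated-alias route is taken, the mechanics are cheap: Spark's 
internal config machinery already supports alternative keys that are honored 
with a deprecation warning. A sketch only, assuming the key name proposed in 
this thread (spark.mesos.gpus.max is the existing Mesos setting):

    import org.apache.spark.internal.config.ConfigBuilder

    // Sketch: the old Mesos key is read when the new key is unset.
    val EXECUTOR_GPUS = ConfigBuilder("spark.executor.accelerator.gpu.count")
      .doc("Number of GPUs to request for each executor.")
      .withAlternative("spark.mesos.gpus.max") // deprecated alias
      .intConf
      .createWithDefault(0)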

I think it could go into Spark 3 but need not block it. This doesn't
say it does, merely says it's desirable to have it ready for 3.0 if
possible. That seems like a fine position.

On Mon, Mar 18, 2019 at 1:56 PM Xingbo Jiang  wrote:
>
> Hi all,
>
> I updated the SPIP doc and stories, I hope it now contains clear scope of the 
> changes and enough details for SPIP vote.
> Please review the updated docs, thanks!
>
> Xiangrui Meng wrote on Wed, Mar 6, 2019 at 8:35 AM:
>>
>> How about letting Xingbo make a major revision to the SPIP doc to make it 
>> clear what is proposed? I like Felix's suggestion to switch to the new 
>> Heilmeier template, which helps clarify what is proposed and what is not. 
>> Then let's review the new SPIP and resume the vote.
>>
>> On Tue, Mar 5, 2019 at 7:54 AM Imran Rashid  wrote:
>>>
>>> OK, I suppose then we are getting bogged down into what a vote on an SPIP 
>>> means then anyway, which I guess we can set aside for now.  With the level 
>>> of detail in this proposal, I feel like there is a reasonable chance I'd 
>>> still -1 the design or implementation.
>>>
>>> And the other thing you're implicitly asking the community for is to 
>>> prioritize this feature for continued review and maintenance.  There is 
>>> already work to be done in things like making barrier mode support dynamic 
>>> allocation (SPARK-24942), bugs in failure handling (eg. SPARK-25250), and 
>>> general efficiency of failure handling (eg. SPARK-25341, SPARK-20178).  I'm 
>>> very concerned about getting spread too thin.
>>>
>>>
>>> But if this is really just a vote on (1) is better gpu support important 
>>> for spark, in some form, in some release? and (2) is it *possible* to do 
>>> this in a safe way?  then I will vote +0.
>>>
>>> On Tue, Mar 5, 2019 at 8:25 AM Tom Graves  wrote:

 So to me most of the questions here are implementation/design questions, 
 I've had this issue in the past with SPIP's where I expected to have more 
 high level design details but was basically told that belongs in the 
 design jira follow on. This makes me think we need to revisit what a SPIP 
 really needs to contain, which should be done in a separate thread.  Note 
 personally I would be for having more high level details in it.
 But the way I read our documentation on a SPIP right now that detail is 
 all optional, now maybe we could argue its based on what reviewers 
 request, but really perhaps we should make the wording of that more 
 required.  thoughts?  We should probably separate that discussion if 
 people want to talk about that.

 For this SPIP in particular the reason I +1 it is because it came down to 
 2 questions:

 1) do I think spark should support this -> my answer is yes, I think this 
 would improve spark, users have been requesting both better GPUs support 
 and support for controlling container requests at a finer granularity for 
 a while.  If spark doesn't support this then users may go to something 
 else, so I think we should support it 

 2) do I think its possible to design and implement it without causing 
 large instabilities?   My opinion here again is yes. I agree with Imran 
 and others that the scheduler piece needs to be looked at very closely as 
 we have had a lot of issues there and that is why I was asking for more 
 details in the design jira:  
 https://issues.apache.org/jira/browse/SPARK-27005.  But I do believe it's 
 possible to do.

 If others have reservations on similar questions then I think we should 
 resolve here or take the discussion of what a SPIP is to a different 
 thread and then come back to this, thoughts?

 Note there is a high level design for at least the core piece, which is 
 what people seem concerned with, already so including it in the SPIP 
 should be straightforward.

 Tom

 On Monday, March 4, 2019, 2:52:43 PM CST, Imran Rashid 
  wrote:


 On Sun, Mar 3, 2019 at 6:51 PM Xiangrui Meng  wrote:

 On Sun, Mar 3, 2019 at 10:20 AM Felix Cheung  
 wrote:

 IMO upfront allocation is less useful. Specifically too expensive for 
 large jobs.


 This is also an API/design discussion.


 I agree with Felix -- this is more than just an API question.  It has a 
 huge impact on the complexity of what you're proposing.  You might be 
 proposing big changes to a core and brittle part of spark, which is 
 already short of experts.

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-18 Thread Xingbo Jiang
Hi all,

I updated the SPIP doc

and stories
,
I hope it now contains clear scope of the changes and enough details for
SPIP vote.
Please review the updated docs, thanks!

Xiangrui Meng wrote on Wed, Mar 6, 2019 at 8:35 AM:

> How about letting Xingbo make a major revision to the SPIP doc to make it
> clear what is proposed? I like Felix's suggestion to switch to the new
> Heilmeier template, which helps clarify what is proposed and what is not.
> Then let's review the new SPIP and resume the vote.
>
> On Tue, Mar 5, 2019 at 7:54 AM Imran Rashid  wrote:
>
>> OK, I suppose then we are getting bogged down into what a vote on an SPIP
>> means then anyway, which I guess we can set aside for now.  With the level
>> of detail in this proposal, I feel like there is a reasonable chance I'd
>> still -1 the design or implementation.
>>
>> And the other thing you're implicitly asking the community for is to
>> prioritize this feature for continued review and maintenance.  There is
>> already work to be done in things like making barrier mode support dynamic
>> allocation (SPARK-24942), bugs in failure handling (eg. SPARK-25250), and
>> general efficiency of failure handling (eg. SPARK-25341, SPARK-20178).  I'm
>> very concerned about getting spread too thin.
>>
>
>> But if this is really just a vote on (1) is better gpu support important
>> for spark, in some form, in some release? and (2) is it *possible* to do
>> this in a safe way?  then I will vote +0.
>>
>> On Tue, Mar 5, 2019 at 8:25 AM Tom Graves  wrote:
>>
>>> So to me most of the questions here are implementation/design questions,
>>> I've had this issue in the past with SPIP's where I expected to have more
>>> high level design details but was basically told that belongs in the design
>>> jira follow on. This makes me think we need to revisit what a SPIP really
>>> needs to contain, which should be done in a separate thread.  Note
>>> personally I would be for having more high level details in it.
>>> But the way I read our documentation on a SPIP right now that detail is
>>> all optional, now maybe we could argue its based on what reviewers request,
>>> but really perhaps we should make the wording of that more required.
>>>  thoughts?  We should probably separate that discussion if people want to
>>> talk about that.
>>>
>>> For this SPIP in particular the reason I +1 it is because it came down
>>> to 2 questions:
>>>
>>> 1) do I think spark should support this -> my answer is yes, I think
>>> this would improve spark, users have been requesting both better GPUs
>>> support and support for controlling container requests at a finer
>>> granularity for a while.  If spark doesn't support this then users may go
>>> to something else, so I think we should support it
>>>
>>> 2) do I think its possible to design and implement it without causing
>>> large instabilities?   My opinion here again is yes. I agree with Imran and
>>> others that the scheduler piece needs to be looked at very closely as we
>>> have had a lot of issues there and that is why I was asking for more
>>> details in the design jira:
>>> https://issues.apache.org/jira/browse/SPARK-27005.  But I do believe
>>> it's possible to do.
>>>
>>> If others have reservations on similar questions then I think we should
>>> resolve here or take the discussion of what a SPIP is to a different thread
>>> and then come back to this, thoughts?
>>>
>>> Note there is a high level design for at least the core piece, which is
>>> what people seem concerned with, already so including it in the SPIP should
>>> be straightforward.
>>>
>>> Tom
>>>
>>> On Monday, March 4, 2019, 2:52:43 PM CST, Imran Rashid <
>>> im...@therashids.com> wrote:
>>>
>>>
>>> On Sun, Mar 3, 2019 at 6:51 PM Xiangrui Meng  wrote:
>>>
>>> On Sun, Mar 3, 2019 at 10:20 AM Felix Cheung 
>>> wrote:
>>>
>>> IMO upfront allocation is less useful. Specifically too expensive for
>>> large jobs.
>>>
>>>
>>> This is also an API/design discussion.
>>>
>>>
>>> I agree with Felix -- this is more than just an API question.  It has a
>>> huge impact on the complexity of what you're proposing.  You might be
>>> proposing big changes to a core and brittle part of spark, which is already
>>> short of experts.
>>>
>>> I don't see any value in having a vote on "does feature X sound cool?"
>>> We have to evaluate the potential benefit against the risks the feature
>>> brings and the continued maintenance cost.  We don't need super low-level
>>> details, but we have to have a sketch of the design to be able to make that
>>> tradeoff.
>>>
>>


Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-05 Thread Xiangrui Meng
How about letting Xingbo make a major revision to the SPIP doc to make it
clear what is proposed? I like Felix's suggestion to switch to the new
Heilmeier template, which helps clarify what is proposed and what is not.
Then let's review the new SPIP and resume the vote.

On Tue, Mar 5, 2019 at 7:54 AM Imran Rashid  wrote:

> OK, I suppose then we are getting bogged down into what a vote on an SPIP
> means then anyway, which I guess we can set aside for now.  With the level
> of detail in this proposal, I feel like there is a reasonable chance I'd
> still -1 the design or implementation.
>
> And the other thing you're implicitly asking the community for is to
> prioritize this feature for continued review and maintenance.  There is
> already work to be done in things like making barrier mode support dynamic
> allocation (SPARK-24942), bugs in failure handling (eg. SPARK-25250), and
> general efficiency of failure handling (eg. SPARK-25341, SPARK-20178).  I'm
> very concerned about getting spread too thin.
>

> But if this is really just a vote on (1) is better gpu support important
> for spark, in some form, in some release? and (2) is it *possible* to do
> this in a safe way?  then I will vote +0.
>
> On Tue, Mar 5, 2019 at 8:25 AM Tom Graves  wrote:
>
>> So to me most of the questions here are implementation/design questions,
>> I've had this issue in the past with SPIP's where I expected to have more
>> high level design details but was basically told that belongs in the design
>> jira follow on. This makes me think we need to revisit what a SPIP really
>> needs to contain, which should be done in a separate thread.  Note
>> personally I would be for having more high level details in it.
>> But the way I read our documentation on a SPIP right now that detail is
>> all optional, now maybe we could argue its based on what reviewers request,
>> but really perhaps we should make the wording of that more required.
>>  thoughts?  We should probably separate that discussion if people want to
>> talk about that.
>>
>> For this SPIP in particular the reason I +1 it is because it came down to
>> 2 questions:
>>
>> 1) do I think spark should support this -> my answer is yes, I think this
>> would improve spark, users have been requesting both better GPUs support
>> and support for controlling container requests at a finer granularity for a
>> while.  If spark doesn't support this then users may go to something else,
>> so I think we should support it
>>
>> 2) do I think its possible to design and implement it without causing
>> large instabilities?   My opinion here again is yes. I agree with Imran and
>> others that the scheduler piece needs to be looked at very closely as we
>> have had a lot of issues there and that is why I was asking for more
>> details in the design jira:
>> https://issues.apache.org/jira/browse/SPARK-27005.  But I do believe it's
>> possible to do.
>>
>> If others have reservations on similar questions then I think we should
>> resolve here or take the discussion of what a SPIP is to a different thread
>> and then come back to this, thoughts?
>>
>> Note there is a high level design for at least the core piece, which is
>> what people seem concerned with, already so including it in the SPIP should
>> be straightforward.
>>
>> Tom
>>
>> On Monday, March 4, 2019, 2:52:43 PM CST, Imran Rashid <
>> im...@therashids.com> wrote:
>>
>>
>> On Sun, Mar 3, 2019 at 6:51 PM Xiangrui Meng  wrote:
>>
>> On Sun, Mar 3, 2019 at 10:20 AM Felix Cheung 
>> wrote:
>>
>> IMO upfront allocation is less useful. Specifically too expensive for
>> large jobs.
>>
>>
>> This is also an API/design discussion.
>>
>>
>> I agree with Felix -- this is more than just an API question.  It has a
>> huge impact on the complexity of what you're proposing.  You might be
>> proposing big changes to a core and brittle part of spark, which is already
>> short of experts.
>>
>> I don't see any value in having a vote on "does feature X sound cool?"
>> We have to evaluate the potential benefit against the risks the feature
>> brings and the continued maintenance cost.  We don't need super low-level
>> details, but we have to have a sketch of the design to be able to make that
>> tradeoff.
>>
>


Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-05 Thread Imran Rashid
OK, I suppose then we are getting bogged down into what a vote on an SPIP
means then anyway, which I guess we can set aside for now.  With the level
of detail in this proposal, I feel like there is a reasonable chance I'd
still -1 the design or implementation.

And the other thing you're implicitly asking the community for is to
prioritize this feature for continued review and maintenance.  There is
already work to be done in things like making barrier mode support dynamic
allocation (SPARK-24942), bugs in failure handling (eg. SPARK-25250), and
general efficiency of failure handling (eg. SPARK-25341, SPARK-20178).  I'm
very concerned about getting spread too thin.

But if this is really just a vote on (1) is better gpu support important
for spark, in some form, in some release? and (2) is it *possible* to do
this in a safe way?  then I will vote +0.

On Tue, Mar 5, 2019 at 8:25 AM Tom Graves  wrote:

> So to me most of the questions here are implementation/design questions,
> I've had this issue in the past with SPIP's where I expected to have more
> high level design details but was basically told that belongs in the design
> jira follow on. This makes me think we need to revisit what a SPIP really
> needs to contain, which should be done in a separate thread.  Note
> personally I would be for having more high level details in it.
> But the way I read our documentation on a SPIP right now that detail is
> all optional, now maybe we could argue its based on what reviewers request,
> but really perhaps we should make the wording of that more required.
>  thoughts?  We should probably separate that discussion if people want to
> talk about that.
>
> For this SPIP in particular the reason I +1 it is because it came down to
> 2 questions:
>
> 1) do I think spark should support this -> my answer is yes, I think this
> would improve spark, users have been requesting both better GPUs support
> and support for controlling container requests at a finer granularity for a
> while.  If spark doesn't support this then users may go to something else,
> so I think we should support it
>
> 2) do I think its possible to design and implement it without causing
> large instabilities?   My opinion here again is yes. I agree with Imran and
> others that the scheduler piece needs to be looked at very closely as we
> have had a lot of issues there and that is why I was asking for more
> details in the design jira:
> https://issues.apache.org/jira/browse/SPARK-27005.  But I do believe it's
> possible to do.
>
> If others have reservations on similar questions then I think we should
> resolve here or take the discussion of what a SPIP is to a different thread
> and then come back to this, thoughts?
>
> Note there is a high level design for at least the core piece, which is
> what people seem concerned with, already so including it in the SPIP should
> be straightforward.
>
> Tom
>
> On Monday, March 4, 2019, 2:52:43 PM CST, Imran Rashid <
> im...@therashids.com> wrote:
>
>
> On Sun, Mar 3, 2019 at 6:51 PM Xiangrui Meng  wrote:
>
> On Sun, Mar 3, 2019 at 10:20 AM Felix Cheung 
> wrote:
>
> IMO upfront allocation is less useful. Specifically too expensive for
> large jobs.
>
>
> This is also an API/design discussion.
>
>
> I agree with Felix -- this is more than just an API question.  It has a
> huge impact on the complexity of what you're proposing.  You might be
> proposing big changes to a core and brittle part of spark, which is already
> short of experts.
>
> I don't see any value in having a vote on "does feature X sound cool?"  We
> have to evaluate the potential benefit against the risks the feature brings
> and the continued maintenance cost.  We don't need super low-level details,
> but we have to have a sketch of the design to be able to make that tradeoff.
>


Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-05 Thread Tom Graves
 So to me most of the questions here are implementation/design questions, I've 
had this issue in the past with SPIP's where I expected to have more high level 
design details but was basically told that belongs in the design jira follow 
on. This makes me think we need to revisit what a SPIP really needs to contain, 
which should be done in a separate thread.  Note personally I would be for 
having more high level details in it. But the way I read our documentation on a 
SPIP right now that detail is all optional, now maybe we could argue it's based 
on what reviewers request, but really perhaps we should make the wording of 
that more required.  thoughts?  We should probably separate that discussion if 
people want to talk about that.
For this SPIP in particular the reason I +1 it is because it came down to 2 
questions:
1) do I think spark should support this -> my answer is yes, I think this would 
improve spark, users have been requesting both better GPUs support and support 
for controlling container requests at a finer granularity for a while.  If 
spark doesn't support this then users may go to something else, so I think 
we should support it
2) do I think its possible to design and implement it without causing large 
instabilities?   My opinion here again is yes. I agree with Imran and others 
that the scheduler piece needs to be looked at very closely as we have had a 
lot of issues there and that is why I was asking for more details in the design 
jira:  https://issues.apache.org/jira/browse/SPARK-27005.  But I do believe it's 
possible to do.
If others have reservations on similar questions then I think we should resolve 
here or take the discussion of what a SPIP is to a different thread and then 
come back to this, thoughts?    
Note there is a high level design for at least the core piece, which is what 
people seem concerned with, already so including it in the SPIP should be 
straightforward.
Tom
On Monday, March 4, 2019, 2:52:43 PM CST, Imran Rashid 
 wrote:  
 
 On Sun, Mar 3, 2019 at 6:51 PM Xiangrui Meng  wrote:

On Sun, Mar 3, 2019 at 10:20 AM Felix Cheung  wrote:
IMO upfront allocation is less useful. Specifically too expensive for large 
jobs.

This is also an API/design discussion.

I agree with Felix -- this is more than just an API question.  It has a huge 
impact on the complexity of what you're proposing.  You might be proposing big 
changes to a core and brittle part of spark, which is already short of experts.
I don't see any value in having a vote on "does feature X sound cool?"  We have 
to evaluate the potential benefit against the risks the feature brings and the 
continued maintenance cost.  We don't need super low-level details, but we have 
to have a sketch of the design to be able to make that tradeoff.

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-04 Thread Mark Hamstra
I'll try to find some time, but it's really at a premium right now.

On Mon, Mar 4, 2019 at 3:17 PM Xiangrui Meng  wrote:

>
>
> On Mon, Mar 4, 2019 at 3:10 PM Mark Hamstra 
> wrote:
>
>> :) Sorry, that was ambiguous. I was seconding Imran's comment.
>>
>
> Could you also help review Xingbo's design sketch and help evaluate the
> cost?
>
>
>>
>> On Mon, Mar 4, 2019 at 3:09 PM Xiangrui Meng  wrote:
>>
>>>
>>>
>>> On Mon, Mar 4, 2019 at 1:56 PM Mark Hamstra 
>>> wrote:
>>>
 +1

>>>
>>> Mark, just to be clear, are you +1 on the SPIP or Imran's point?
>>>
>>>

 On Mon, Mar 4, 2019 at 12:52 PM Imran Rashid 
 wrote:

> On Sun, Mar 3, 2019 at 6:51 PM Xiangrui Meng  wrote:
>
>> On Sun, Mar 3, 2019 at 10:20 AM Felix Cheung <
>> felixcheun...@hotmail.com> wrote:
>>
>>> IMO upfront allocation is less useful. Specifically too expensive
>>> for large jobs.
>>>
>>
>> This is also an API/design discussion.
>>
>
> I agree with Felix -- this is more than just an API question.  It has
> a huge impact on the complexity of what you're proposing.  You might be
> proposing big changes to a core and brittle part of spark, which is 
> already
> short of experts.
>

>>> To my understanding, Felix's comment is mostly on the user interfaces,
>>> stating upfront allocation is less useful, especially for large jobs. I
>>> agree that for large jobs we better have dynamic allocation, which was
>>> mentioned in the YARN support section in the companion scoping doc. We
>>> restrict the new container type to initially requested to keep things
>>> simple. However upfront allocation already meets the requirements of basic
>>> workflows like data + DL training/inference + data. Saying "it is less
>>> useful specifically for large jobs" kinda missed the fact that "it is super
>>> useful for basic use cases".
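To make the "basic workflow" concrete: the upfront, application-level request 
amounts to a few settings fixed for the whole application. A sketch, using the 
accelerator config names proposed in this thread rather than a shipped API:

    import org.apache.spark.sql.SparkSession

    // One static executor shape for the whole app, ETL and DL stages alike.
    val spark = SparkSession.builder()
      .appName("etl-then-dl-inference")
      .config("spark.executor.cores", "4")
      .config("spark.executor.accelerator.gpu.count", "2") // requested up front
      .config("spark.task.accelerator.gpu.count", "1")     // global task ratio
      .getOrCreate()

Every executor carries its GPUs for the application's lifetime, which is why 
upfront allocation is expensive for large mixed jobs yet serviceable for 
data + train + data pipelines.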
>>>
>>> Your comment is mostly on the implementation side, which IMHO is the
>>> KEY question to conclude this vote: does the design sketch sufficiently
>>> demonstrate that the internal changes to the Spark scheduler are manageable? I
>>> read Xingbo's design sketch and I think it is doable, which led to my +1.
>>> But I'm not an expert on the scheduler. So I would feel more confident if
>>> the design was reviewed by some scheduler experts. I also read the design
>>> sketch to support different cluster managers, which I think is less
>>> critical than the internal scheduler changes.
>>>
>>>

> I don't see any value in having a vote on "does feature X sound cool?"
>

>>> I believe no one would disagree. To prepare the companion doc, we went
>>> through several rounds of discussions to provide concrete stories such that
>>> the proposal is not just "cool".
>>>
>>>

>
 We have to evaluate the potential benefit against the risks the feature
> brings and the continued maintenance cost.  We don't need super low-level
> details, but we have to have a sketch of the design to be able to make that
> tradeoff.
>

>>> Could you review the design sketch from Xingbo, help evaluate the cost,
>>> and provide feedback?
>>>
>>>
>>


Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-04 Thread Mark Hamstra
:) Sorry, that was ambiguous. I was seconding Imran's comment.

On Mon, Mar 4, 2019 at 3:09 PM Xiangrui Meng  wrote:

>
>
> On Mon, Mar 4, 2019 at 1:56 PM Mark Hamstra 
> wrote:
>
>> +1
>>
>
> Mark, just to be clear, are you +1 on the SPIP or Imran's point?
>
>
>>
>> On Mon, Mar 4, 2019 at 12:52 PM Imran Rashid 
>> wrote:
>>
>>> On Sun, Mar 3, 2019 at 6:51 PM Xiangrui Meng  wrote:
>>>
 On Sun, Mar 3, 2019 at 10:20 AM Felix Cheung 
 wrote:

> IMO upfront allocation is less useful. Specifically too expensive for
> large jobs.
>

 This is also an API/design discussion.

>>>
>>> I agree with Felix -- this is more than just an API question.  It has a
>>> huge impact on the complexity of what you're proposing.  You might be
>> proposing big changes to a core and brittle part of Spark, which is already
>>> short of experts.
>>>
>>
> To my understanding, Felix's comment is mostly on the user interfaces,
> stating that upfront allocation is less useful, especially for large jobs.
> I agree that for large jobs we'd better have dynamic allocation, which was
> mentioned in the YARN support section of the companion scoping doc. We
> restrict the new container type to upfront requests to keep things simple.
> However, upfront allocation already meets the requirements of basic
> workflows like data + DL training/inference + data. Saying "it is less
> useful, specifically for large jobs" kinda misses the fact that "it is
> super useful for basic use cases".
>
> Your comment is mostly on the implementation side, which IMHO is the KEY
> question to conclude this vote: does the design sketch sufficiently
> demonstrate that the internal changes to the Spark scheduler are
> manageable? I read Xingbo's design sketch and I think it is doable, which
> led to my +1. But I'm not an expert on the scheduler, so I would feel more
> confident if the design were reviewed by some scheduler experts. I also
> read the design sketch for supporting different cluster managers, which I
> think is less critical than the internal scheduler changes.
>
>
>>
>>> I don't see any value in having a vote on "does feature X sound cool?"
>>>
>>
> I believe no one would disagree. To prepare the companion doc, we went
> through several rounds of discussions to provide concrete stories such that
> the proposal is not just "cool".
>
>
>>
>>>
>> We have to evaluate the potential benefit against the risks the feature
>>> brings and the continued maintenance cost.  We don't need super low-level
>>> details, but we have to have a sketch of the design to be able to make that
>>> tradeoff.
>>>
>>
> Could you review the design sketch from Xingbo, help evaluate the cost,
> and provide feedback?
>
>


Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-04 Thread Xiangrui Meng
On Mon, Mar 4, 2019 at 3:10 PM Mark Hamstra  wrote:

> :) Sorry, that was ambiguous. I was seconding Imran's comment.
>

Could you also help review Xingbo's design sketch and help evaluate the
cost?


>
> On Mon, Mar 4, 2019 at 3:09 PM Xiangrui Meng  wrote:
>
>>
>>
>> On Mon, Mar 4, 2019 at 1:56 PM Mark Hamstra 
>> wrote:
>>
>>> +1
>>>
>>
>> Mark, just to be clear, are you +1 on the SPIP or Imran's point?
>>
>>
>>>
>>> On Mon, Mar 4, 2019 at 12:52 PM Imran Rashid 
>>> wrote:
>>>
 On Sun, Mar 3, 2019 at 6:51 PM Xiangrui Meng  wrote:

> On Sun, Mar 3, 2019 at 10:20 AM Felix Cheung <
> felixcheun...@hotmail.com> wrote:
>
>> IMO upfront allocation is less useful. Specifically too expensive for
>> large jobs.
>>
>
> This is also an API/design discussion.
>

 I agree with Felix -- this is more than just an API question.  It has a
 huge impact on the complexity of what you're proposing.  You might be
 proposing big changes to a core and brittle part of Spark, which is already
 short of experts.

>>>
>> To my understanding, Felix's comment is mostly on the user interfaces,
>> stating that upfront allocation is less useful, especially for large jobs.
>> I agree that for large jobs we'd better have dynamic allocation, which was
>> mentioned in the YARN support section of the companion scoping doc. We
>> restrict the new container type to upfront requests to keep things simple.
>> However, upfront allocation already meets the requirements of basic
>> workflows like data + DL training/inference + data. Saying "it is less
>> useful, specifically for large jobs" kinda misses the fact that "it is
>> super useful for basic use cases".
>>
>> Your comment is mostly on the implementation side, which IMHO is the KEY
>> question to conclude this vote: does the design sketch sufficiently
>> demonstrate that the internal changes to the Spark scheduler are
>> manageable? I read Xingbo's design sketch and I think it is doable, which
>> led to my +1. But I'm not an expert on the scheduler, so I would feel more
>> confident if the design were reviewed by some scheduler experts. I also
>> read the design sketch for supporting different cluster managers, which I
>> think is less critical than the internal scheduler changes.
>>
>>
>>>
 I don't see any value in having a vote on "does feature X sound cool?"

>>>
>> I believe no one would disagree. To prepare the companion doc, we went
>> through several rounds of discussions to provide concrete stories such that
>> the proposal is not just "cool".
>>
>>
>>>

>>> We have to evaluate the potential benefit against the risks the feature
 brings and the continued maintenance cost.  We don't need super low-level
 details, but we have to have a sketch of the design to be able to make that
 tradeoff.

>>>
>> Could you review the design sketch from Xingbo, help evaluate the cost,
>> and provide feedback?
>>
>>
>


Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-04 Thread Xiangrui Meng
On Mon, Mar 4, 2019 at 1:56 PM Mark Hamstra  wrote:

> +1
>

Mark, just to be clear, are you +1 on the SPIP or Imran's point?


>
> On Mon, Mar 4, 2019 at 12:52 PM Imran Rashid  wrote:
>
>> On Sun, Mar 3, 2019 at 6:51 PM Xiangrui Meng  wrote:
>>
>>> On Sun, Mar 3, 2019 at 10:20 AM Felix Cheung 
>>> wrote:
>>>
 IMO upfront allocation is less useful. Specifically too expensive for
 large jobs.

>>>
>>> This is also an API/design discussion.
>>>
>>
>> I agree with Felix -- this is more than just an API question.  It has a
>> huge impact on the complexity of what you're proposing.  You might be
>> proposing big changes to a core and brittle part of Spark, which is already
>> short of experts.
>>
>
To my understanding, Felix's comment is mostly on the user interfaces,
stating that upfront allocation is less useful, especially for large jobs. I
agree that for large jobs we'd better have dynamic allocation, which was
mentioned in the YARN support section of the companion scoping doc. We
restrict the new container type to upfront requests to keep things simple.
However, upfront allocation already meets the requirements of basic
workflows like data + DL training/inference + data. Saying "it is less
useful, specifically for large jobs" kinda misses the fact that "it is super
useful for basic use cases".

Your comment is mostly on the implementation side, which IMHO is the KEY
question to conclude this vote: does the design sketch sufficiently
demonstrate that the internal changes to the Spark scheduler are manageable?
I read Xingbo's design sketch and I think it is doable, which led to my +1.
But I'm not an expert on the scheduler, so I would feel more confident if
the design were reviewed by some scheduler experts. I also read the design
sketch for supporting different cluster managers, which I think is less
critical than the internal scheduler changes.


>
>> I don't see any value in having a vote on "does feature X sound cool?"
>>
>
I believe no one would disagree. To prepare the companion doc, we went
through several rounds of discussions to provide concrete stories such that
the proposal is not just "cool".


>
>>
> We have to evaluate the potential benefit against the risks the feature
>> brings and the continued maintenance cost.  We don't need super low-level
>> details, but we have to have a sketch of the design to be able to make that
>> tradeoff.
>>
>
Could you review the design sketch from Xingbo, help evaluate the cost, and
provide feedback?


Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-04 Thread Mark Hamstra
+1

On Mon, Mar 4, 2019 at 12:52 PM Imran Rashid  wrote:

> On Sun, Mar 3, 2019 at 6:51 PM Xiangrui Meng  wrote:
>
>> On Sun, Mar 3, 2019 at 10:20 AM Felix Cheung 
>> wrote:
>>
>>> IMO upfront allocation is less useful. Specifically too expensive for
>>> large jobs.
>>>
>>
>> This is also an API/design discussion.
>>
>
> I agree with Felix -- this is more than just an API question.  It has a
> huge impact on the complexity of what you're proposing.  You might be
> proposing big changes to a core and brittle part of Spark, which is already
> short of experts.
>
> I don't see any value in having a vote on "does feature X sound cool?"  We
> have to evaluate the potential benefit against the risks the feature brings
> and the continued maintenance cost.  We don't need super low-level details,
> but we have to have a sketch of the design to be able to make that tradeoff.
>


Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-04 Thread Imran Rashid
On Sun, Mar 3, 2019 at 6:51 PM Xiangrui Meng  wrote:

> On Sun, Mar 3, 2019 at 10:20 AM Felix Cheung 
> wrote:
>
>> IMO upfront allocation is less useful. Specifically too expensive for
>> large jobs.
>>
>
> This is also an API/design discussion.
>

I agree with Felix -- this is more than just an API question.  It has a
huge impact on the complexity of what you're proposing.  You might be
proposing big changes to a core and brittle part of Spark, which is already
short of experts.

I don't see any value in having a vote on "does feature X sound cool?"  We
have to evaluate the potential benefit against the risks the feature brings
and the continued maintenance cost.  We don't need super low-level details,
but we have to have a sketch of the design to be able to make that tradeoff.


Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-04 Thread Sean Owen
It sounds like there's a discussion about the details coming, which is
fine and good. That should maybe also have a VOTE. The debate here is
then merely about what and when to call things a SPIP, but that's not
important.

On Mon, Mar 4, 2019 at 10:23 AM Xiangrui Meng  wrote:
> I think the two that require more discussion are Mesos and K8s. Let me follow 
> what I suggested above and try to answer two questions for each:
>
> Mesos:
> * Is it important? There are certainly Spark/Mesos users but the overall 
> usage is going downhill. See the attached Google Trend snapshot.
> * How to implement it? I believe it is doable, similarly to other cluster 
> managers. However, we need to find someone from our community to do the work. 
> If we cannot find such a person, it would indicate that the feature is not 
> that important.

I don't think that was the issue that was raised; I don't advocate for
investing more in supporting this cluster manager, myself.
The issue was that we _already_ have support for allocating GPUs in
Mesos. Whatever limited support is there, presumably, doesn't get
removed. It merely needs to be attached to whatever new mechanisms are
implemented. I only pushed back on the idea that it should be ignored
and (presumably) left as a separate unrelated implementation.

> You see that such discussions can be done in parallel. It is not efficient if 
> we block the work on K8s because we cannot decide whether we should support 
> Mesos.

Is the question blocking anything? An answer is: let's say we just
make sure whatever Mesos support exists still works coherently with the
new mechanism, whatever those details may be. Is there any
disagreement on that out there? I agree with you in that I think it
shouldn't have been ruled out at this stage, per earlier comments.
This doesn't seem hard to answer as a question of scope even now.
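
(For context, a minimal sketch of the two config surfaces being reconciled
here: spark.mesos.gpus.max is the existing Mesos-only knob, while the generic
task-level key beside it is an assumed placeholder for whatever mechanism the
SPIP ends up defining, not a settled name.)

    import org.apache.spark.SparkConf

    // Existing Mesos-only surface: caps the total number of GPUs Spark may
    // accept from Mesos resource offers.
    val conf = new SparkConf()
      .set("spark.mesos.gpus.max", "4")
      // Assumed placeholder for a cluster-manager-agnostic, task-level GPU
      // request of the kind the SPIP proposes; the real key was undecided.
      .set("spark.task.resource.gpu.amount", "1")

The only point of the sketch is that the old knob and any new mechanism would
have to stay coherent, e.g., with one treated as a synonym for the other.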




Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-04 Thread Xiangrui Meng
On Mon, Mar 4, 2019 at 8:23 AM Xiangrui Meng  wrote:

>
>
> On Mon, Mar 4, 2019 at 7:24 AM Sean Owen  wrote:
>
>> To be clear, those goals sound fine to me. I don't think voting on
>> those two broad points is meaningful, but it does no harm per se. If you
>> mean this is just a check to see if people believe this is broadly
>> worthwhile, then +1 from me. Yes it is.
>>
>> That means we'd want to review something more detailed later, whether
>> it's a) a design doc we vote on or b) a series of pull requests. Given
>> the number of questions this leaves open, a) sounds better and I think
>> what you're suggesting. I'd call that the SPIP, but, so what, it's
>> just a name. The thing is, a) seems already mostly done, in the second
>> document that was attached.
>
>
> It is far from done. We still need to review the APIs and the design for
> each major component:
>
> * Internal changes to Spark job scheduler.
> * Interfaces exposed to users.
> * Interfaces exposed to cluster managers.
> * Standalone / auto-discovery.
> * YARN
> * K8s
> * Mesos
> * Jenkins
>
> I try to avoid discussing each of them in this thread because they require
> different domain experts. Once we have a high-level agreement on adding
> accelerator support to Spark, we can kick off the work in parallel. If any
> committer thinks follow-up work still needs an SPIP, we just follow the
> SPIP process to resolve it.
>
>
>> I'm hesitating because I'm not sure why
>> it's important to not discuss that level of detail here, as it's
>> already available. Just too much noise?
>
>
> Yes. If we go down one or two levels, we might have to pull in different
> domain experts for different questions.
>
>
>> But voting for this seems like
>> endorsing those decisions, as I can only assume the proposer is going
>> to continue the design with those decisions in mind.
>>
>
> That is certainly not the purpose, which was why there were two docs, not
> just one SPIP. The purpose of the companion doc is just to give some
> concrete stories and estimate what could be done in Spark 3.0. Maybe we
> should update the SPIP doc and make it clear that certain features are
> pending follow-up discussions.
>
>
>>
>> What's the next step in your view, after this, and before it's
>> implemented? As long as there is one, sure, let's punt. Seems like we
>> could begin that conversation nowish.
>>
>
> We should assign each major component an "owner" who can lead the
> follow-up work, e.g.,
>
> * Internal changes to Spark scheduler
> * Interfaces to cluster managers and users
> * Standalone support
> * YARN support
> * K8s support
> * Mesos support
> * Test infrastructure
> * FPGA
>
> Again, for each component the question we should answer first is "Is it
> important?" and then "How to implement it?". Community members who are
> interested in each discussion should subscribe to the corresponding JIRA.
> If some committer thinks we need a follow-up SPIP, either to make more
> members aware of the changes or to reach agreement, feel free to call it
> out.
>
>
>>
>> Many of those questions you list are _fine_ for a SPIP, in my opinion.
>> (Of course, I'd add what cluster managers are in/out of scope.)
>>
>
> I think the two that require more discussion are Mesos and K8s. Let me follow
> what I suggested above and try to answer two questions for each:
>
> Mesos:
> * Is it important? There are certainly Spark/Mesos users but the overall
> usage is going downhill. See the attached Google Trend snapshot.
>

[image: Screen Shot 2019-03-04 at 8.10.50 AM.png]


> * How to implement it? I believe it is doable, similarly to other cluster
> managers. However, we need to find someone from our community to do the
> work. If we cannot find such a person, it would indicate that the feature
> is not that important.
>
> K8s:
> * Is it important? K8s is the fastest-growing cluster manager. But the current
> Spark support is experimental. Building features on top would add
> additional cost if we want to make changes.
> * How to implement it? There is a sketch in the companion doc. Yinan
> mentioned three options to expose the interfaces to users. We need to
> finalize the design and discuss which option is the best to go with.
>
> You see that such discussions can be done in parallel. It is not efficient
> if we block the work on K8s because we cannot decide whether we should
> support Mesos.
>
>
>>
>>
>> On Mon, Mar 4, 2019 at 9:07 AM Xiangrui Meng  wrote:
>> >
>> > What finer "high level" goals do you recommend? To make progress on the
>> vote, it would be great if you can articulate more. Current SPIP proposes
>> two high-level changes to make Spark accelerator-aware:
>> >
>> > At cluster manager level, we update or upgrade cluster managers to
>> include GPU support. Then we expose user interfaces for Spark to request
>> GPUs from them.
>> > Within Spark, we update its scheduler to understand available GPUs
>> allocated to executors, user task requests, and assign GPUs to tasks
>> properly.
>> >
> > How do you want to change or refine them?

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-04 Thread Xiangrui Meng
On Mon, Mar 4, 2019 at 7:24 AM Sean Owen  wrote:

> To be clear, those goals sound fine to me. I don't think voting on
> those two broad points is meaningful, but it does no harm per se. If you
> mean this is just a check to see if people believe this is broadly
> worthwhile, then +1 from me. Yes it is.
>
> That means we'd want to review something more detailed later, whether
> it's a) a design doc we vote on or b) a series of pull requests. Given
> the number of questions this leaves open, a) sounds better and I think
> what you're suggesting. I'd call that the SPIP, but, so what, it's
> just a name. The thing is, a) seems already mostly done, in the second
> document that was attached.


It is far from done. We still need to review the APIs and the design for
each major component:

* Internal changes to Spark job scheduler.
* Interfaces exposed to users.
* Interfaces exposed to cluster managers.
* Standalone / auto-discovery.
* YARN
* K8s
* Mesos
* Jenkins

I try to avoid discussing each of them in this thread because they require
different domain experts. Once we have a high-level agreement on adding
accelerator support to Spark, we can kick off the work in parallel. If any
committer thinks follow-up work still needs an SPIP, we just follow the
SPIP process to resolve it.


> I'm hesitating because I'm not sure why
> it's important to not discuss that level of detail here, as it's
> already available. Just too much noise?


Yes. If we go down one or two levels, we might have to pull in different
domain experts for different questions.


> But voting for this seems like
> endorsing those decisions, as I can only assume the proposer is going
> to continue the design with those decisions in mind.
>

That is certainly not the purpose, which was why there were two docs, not
just one SPIP. The purpose of the companion doc is just to give some
concrete stories and estimate what could be done in Spark 3.0. Maybe we
should update the SPIP doc and make it clear that certain features are
pending follow-up discussions.


>
> What's the next step in your view, after this, and before it's
> implemented? As long as there is one, sure, let's punt. Seems like we
> could begin that conversation nowish.
>

We should assign each major component an "owner" who can lead the follow-up
work, e.g.,

* Internal changes to Spark scheduler
* Interfaces to cluster managers and users
* Standalone support
* YARN support
* K8s support
* Mesos support
* Test infrastructure
* FPGA

Again, for each component the question we should answer first is "Is it
important?" and then "How to implement it?". Community members who are
interested in each discussion should subscribe to the corresponding JIRA.
If some committer thinks we need a follow-up SPIP, either to make more
members aware of the changes or to reach agreement, feel free to call it
out.


>
> Many of those questions you list are _fine_ for a SPIP, in my opinion.
> (Of course, I'd add what cluster managers are in/out of scope.)
>

I think the two that require more discussion are Mesos and K8s. Let me follow
what I suggested above and try to answer two questions for each:

Mesos:
* Is it important? There are certainly Spark/Mesos users but the overall
usage is going downhill. See the attached Google Trend snapshot.
* How to implement it? I believe it is doable, similarly to other cluster
managers. However, we need to find someone from our community to do the
work. If we cannot find such a person, it would indicate that the feature
is not that important.

K8s:
* Is it important? K8s is the fastest-growing cluster manager. But the current
Spark support is experimental. Building features on top would add
additional cost if we want to make changes.
* How to implement it? There is a sketch in the companion doc. Yinan
mentioned three options to expose the interfaces to users. We need to
finalize the design and discuss which option is the best to go with.
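
(Illustration only: on K8s, GPUs are exposed through device plugins, so an
executor-level request ultimately has to surface as a pod resource limit such
as nvidia.com/gpu. A sketch assuming Spark maps a vendor-qualified config onto
that limit; both keys are assumptions, and choosing among the three options is
exactly what remains open.)

    import org.apache.spark.SparkConf

    // Sketch: the K8s backend could translate these settings into a
    // "nvidia.com/gpu: 1" resource limit on each executor pod.
    // Both keys are illustrative assumptions, not a settled interface.
    val conf = new SparkConf()
      .set("spark.executor.resource.gpu.amount", "1")
      .set("spark.executor.resource.gpu.vendor", "nvidia.com")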

You see that such discussions can be done in parallel. It is not efficient
if we block the work on K8s because we cannot decide whether we should
support Mesos.


>
>
> On Mon, Mar 4, 2019 at 9:07 AM Xiangrui Meng  wrote:
> >
> > What finer "high level" goals do you recommend? To make progress on the
> vote, it would be great if you can articulate more. Current SPIP proposes
> two high-level changes to make Spark accelerator-aware:
> >
> > At cluster manager level, we update or upgrade cluster managers to
> include GPU support. Then we expose user interfaces for Spark to request
> GPUs from them.
> > Within Spark, we update its scheduler to understand available GPUs
> allocated to executors, user task requests, and assign GPUs to tasks
> properly.
> >
> > How do you want to change or refine them? I saw you raised questions
> around Horovod requirements and GPU/memory allocation. But there are tens
> of questions at the same or even higher level. E.g., in preparing the
> companion scoping doc we saw the following questions:
> >
> > * How to test GPU support on Jenkins?

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-04 Thread Sean Owen
To be clear, those goals sound fine to me. I don't think voting on
those two broad points is meaningful, but it does no harm per se. If you
mean this is just a check to see if people believe this is broadly
worthwhile, then +1 from me. Yes it is.

That means we'd want to review something more detailed later, whether
it's a) a design doc we vote on or b) a series of pull requests. Given
the number of questions this leaves open, a) sounds better and I think
what you're suggesting. I'd call that the SPIP, but, so what, it's
just a name. The thing is, a) seems already mostly done, in the second
document that was attached. I'm hesitating because I'm not sure why
it's important to not discuss that level of detail here, as it's
already available. Just too much noise? But voting for this seems like
endorsing those decisions, as I can only assume the proposer is going
to continue the design with those decisions in mind.

What's the next step in your view, after this, and before it's
implemented? As long as there is one, sure, let's punt. Seems like we
could begin that conversation nowish.

Many of those questions you list are _fine_ for a SPIP, in my opinion.
(Of course, I'd add what cluster managers are in/out of scope.)


On Mon, Mar 4, 2019 at 9:07 AM Xiangrui Meng  wrote:
>
> What finer "high level" goals do you recommend? To make progress on the vote, 
> it would be great if you can articulate more. Current SPIP proposes two 
> high-level changes to make Spark accelerator-aware:
>
> At cluster manager level, we update or upgrade cluster managers to include 
> GPU support. Then we expose user interfaces for Spark to request GPUs from 
> them.
> Within Spark, we update its scheduler to understand available GPUs allocated 
> to executors, user task requests, and assign GPUs to tasks properly.
>
> How do you want to change or refine them? I saw you raised questions around 
> Horovod requirements and GPU/memory allocation. But there are tens of 
> questions at the same or even higher level. E.g., in preparing the companion 
> scoping doc we saw the following questions:
>
> * How to test GPU support on Jenkins?
> * Does the solution proposed also work for FPGA? What are the diffs?
> * How to make standalone workers auto-discover GPU resources?
> * Do we want to allow users to request GPU resources in Pandas UDF?
> * How does a user pass GPU requests to K8s, spark-submit command line or 
> pod template?
> * Do we create a separate queue for GPU task scheduling so it doesn't cause 
> regression on normal jobs?
> * How to monitor the utilization of GPU? At what levels?
> * Do we want to support GPU-backed physical operators?
> * Do we allow users to request both non-default number of CPUs and GPUs?
> * ...
>
> IMHO, we cannot, nor should we, answer questions at this level in this vote. 
> The vote is mainly on whether we should make Spark accelerator-aware to help 
> unify big data and AI solutions, specifically whether Spark should provide 
> proper support to deep learning model training and inference where 
> accelerators are essential. My +1 vote is based on the following logic:
>
> * It is important for Spark to become the de facto solution in connecting big 
> data and AI.
> * The work is doable given the design sketch and the early 
> investigation/scoping.
>
> To me, "-1" means either it is not important for Spark to support such use 
> cases or we certainly cannot afford to implement such support. This is my 
> understanding of the SPIP and the vote. It would be great if you can 
> elaborate what changes you want to make or what answers you want to see.
>




Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-04 Thread Xiangrui Meng
What finer "high level" goals do you recommend? To make progress on the
vote, it would be great if you can articulate more. Current SPIP proposes
two high-level changes to make Spark accelerator-aware:

   - At cluster manager level, we update or upgrade cluster managers to
   include GPU support. Then we expose user interfaces for Spark to request
   GPUs from them.
   - Within Spark, we update its scheduler to understand available GPUs
   allocated to executors, user task requests, and assign GPUs to tasks
   properly.

How do you want to change or refine them? I saw you raised questions around
Horovod requirements and GPU/memory allocation. But there are tens of
questions at the same or even higher level. E.g., in preparing the
companion scoping doc we saw the following questions:

* How to test GPU support on Jenkins?
* Does the solution proposed also work for FPGA? What are the diffs?
* How to make standalone workers auto-discover GPU resources?
* Do we want to allow users to request GPU resources in Pandas UDF?
* How does a user pass GPU requests to K8s, spark-submit command line or
pod template?
* Do we create a separate queue for GPU task scheduling so it doesn't cause
regression on normal jobs?
* How to monitor the utilization of GPU? At what levels?
* Do we want to support GPU-backed physical operators?
* Do we allow users to request both non-default number of CPUs and GPUs?
* ...

IMHO, we cannot, nor should we, answer questions at this level in this vote.
The vote is mainly on whether we should make Spark accelerator-aware to
help unify big data and AI solutions, specifically whether Spark should
provide proper support to deep learning model training and inference where
accelerators are essential. My +1 vote is based on the following logic:

* It is important for Spark to become the de facto solution in connecting
big data and AI.
* The work is doable given the design sketch and the early
investigation/scoping.

To me, "-1" means either it is not important for Spark to support such use
cases or we certainly cannot afford to implement such support. This is my
understanding of the SPIP and the vote. It would be great if you can
elaborate what changes you want to make or what answers you want to see.
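
(To make the two high-level changes concrete, a minimal sketch of the
user-facing surface they imply. Every name here, from the config keys to the
resources() accessor, is an illustrative assumption about an eventual API,
not part of what is being voted on.)

    import org.apache.spark.TaskContext
    import org.apache.spark.sql.SparkSession

    // Change 1: ask the cluster manager for GPUs when executors are
    // requested, and declare how many GPUs each task needs.
    val spark = SparkSession.builder()
      .appName("accelerator-aware-sketch")
      .config("spark.executor.resource.gpu.amount", "2")
      .config("spark.task.resource.gpu.amount", "1")
      .getOrCreate()

    // Change 2: the scheduler matches task requests against executor GPUs
    // and hands each task the device addresses it owns, so DL code can pin
    // work to those devices.
    spark.sparkContext.parallelize(1 to 100, 10).mapPartitions { it =>
      val gpuAddrs = TaskContext.get().resources()("gpu").addresses
      it.map(x => (gpuAddrs.mkString(","), x))
    }.count()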

On Sun, Mar 3, 2019 at 11:13 PM Felix Cheung 
wrote:

> Once again, I’d have to agree with Sean.
>
> Let’s table the meaning of SPIP for another time, say. I think a few of us
> are trying to understand what “accelerator resource aware” means.
>
> As far as I know, no one is discussing API here. But on the Google doc,
> JIRA, email, and off list, I have seen questions, questions that are
> greatly concerning, like “oh, the scheduler is allocating GPU, but how does
> it affect memory” and many more, and so I think finer “high level” goals
> should be defined.
>
>
>
>
> --
> *From:* Sean Owen 
> *Sent:* Sunday, March 3, 2019 5:24 PM
> *To:* Xiangrui Meng
> *Cc:* Felix Cheung; Xingbo Jiang; Yinan Li; dev; Weichen Xu; Marco Gaido
> *Subject:* Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling
>
> I think treating SPIPs at this high a level takes away much of the point
> of VOTEing on them. I'm not sure that's even what Reynold is
> suggesting elsewhere; we're nowhere near discussing APIs here, just
> what 'accelerator aware' even generally means. If the scope isn't
> specified, what are we trying to bind with a formal VOTE? The worst I
> can say is that this doesn't mean much, so the outcome of the vote
> doesn't matter. The general idea seems fine to me and I support
> _something_ like this.
>
> I think the subtext concern is that SPIPs become a way to request
> cover to make a bunch of decisions separately, later. This is, to some
> extent, how it has to work. A small number of interested parties need
> to decide the details coherently, not design the whole thing by
> committee, with occasional check-ins for feedback. There's a balance
> between that, and using the SPIP as a license to go finish a design
> and proclaim it later. That's not anyone's bad-faith intention, just
> the risk of deferring so much.
>
> Mesos support is not a big deal by itself but a fine illustration of
> the point. That seems like a fine question of scope now, even if the
> 'how' or some of the 'what' can be decided later. I raised an eyebrow
> here at the reply that this was already judged out-of-scope: how much
> are we on the same page about this being a point to consider feedback?
>
> If one wants to VOTE on more details, then this vote just doesn't
> matter much. Is a future step to VOTE on some more detailed design
> doc? Then that's what I call a "SPIP" and it's practically just
> semantics.
>
>
> On Sun, Mar 3, 2019 at 6:51 PM Xiangrui Meng  wrote:
> >
> > Hi Felix,
> >
> >

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-03 Thread Felix Cheung
Once again, I’d have to agree with Sean.

Let’s table the meaning of SPIP for another time, say. I think a few of us are 
trying to understand what “accelerator resource aware” means. As far as I know, 
no one is discussing API here. But on the Google doc, JIRA, email, and off 
list, I have seen questions, questions that are greatly concerning, like “oh, 
the scheduler is allocating GPU, but how does it affect memory” and many more, 
and so I think finer “high level” goals should be defined.





From: Sean Owen 
Sent: Sunday, March 3, 2019 5:24 PM
To: Xiangrui Meng
Cc: Felix Cheung; Xingbo Jiang; Yinan Li; dev; Weichen Xu; Marco Gaido
Subject: Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

I think treating SPIPs at this high a level takes away much of the point
of VOTEing on them. I'm not sure that's even what Reynold is
suggesting elsewhere; we're nowhere near discussing APIs here, just
what 'accelerator aware' even generally means. If the scope isn't
specified, what are we trying to bind with a formal VOTE? The worst I
can say is that this doesn't mean much, so the outcome of the vote
doesn't matter. The general idea seems fine to me and I support
_something_ like this.

I think the subtext concern is that SPIPs become a way to request
cover to make a bunch of decisions separately, later. This is, to some
extent, how it has to work. A small number of interested parties need
to decide the details coherently, not design the whole thing by
committee, with occasional check-ins for feedback. There's a balance
between that, and using the SPIP as a license to go finish a design
and proclaim it later. That's not anyone's bad-faith intention, just
the risk of deferring so much.

Mesos support is not a big deal by itself but a fine illustration of
the point. That seems like a fine question of scope now, even if the
'how' or some of the 'what' can be decided later. I raised an eyebrow
here at the reply that this was already judged out-of-scope: how much
are we on the same page about this being a point to consider feedback?

If one wants to VOTE on more details, then this vote just doesn't
matter much. Is a future step to VOTE on some more detailed design
doc? Then that's what I call a "SPIP" and it's practically just
semantics.


On Sun, Mar 3, 2019 at 6:51 PM Xiangrui Meng  wrote:
>
> Hi Felix,
>
> Just to clarify, we are voting on the SPIP, not the companion scoping doc. 
> What is proposed and what we are voting on is to make Spark 
> accelerator-aware. The companion scoping doc and the design sketch are to 
> help demonstrate what features could be implemented based on the use 
> cases and dev resources the co-authors are aware of. The exact scoping and 
> design would require more community involvement; by no means are we 
> finalizing it in this vote thread.
>
> I think copying the goals and non-goals from the companion scoping doc to the 
> SPIP caused the confusion. As mentioned in the SPIP, we proposed to make two 
> major changes at high level:
>
> At cluster manager level, we update or upgrade cluster managers to include 
> GPU support. Then we expose user interfaces for Spark to request GPUs from 
> them.
> Within Spark, we update its scheduler to understand available GPUs allocated 
> to executors, user task requests, and assign GPUs to tasks properly.
>
> We should keep our vote discussion at this level. It doesn't exclude 
> Mesos/Windows/TPU/FPGA, nor does it commit to supporting YARN/K8s. Through the 
> initial scoping work, we found that we certainly need domain experts to 
> discuss the support of each cluster manager and each accelerator type. But 
> adding more details on Mesos or FPGA doesn't change the SPIP at high level. 
> So we concluded the initial scoping, shared the docs, and started this vote.


Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-03 Thread Sean Owen
I think treating SPIPs at this high a level takes away much of the point
of VOTEing on them. I'm not sure that's even what Reynold is
suggesting elsewhere; we're nowhere near discussing APIs here, just
what 'accelerator aware' even generally means. If the scope isn't
specified, what are we trying to bind with a formal VOTE? The worst I
can say is that this doesn't mean much, so the outcome of the vote
doesn't matter. The general idea seems fine to me and I support
_something_ like this.

I think the subtext concern is that SPIPs become a way to request
cover to make a bunch of decisions separately, later. This is, to some
extent, how it has to work. A small number of interested parties need
to decide the details coherently, not design the whole thing by
committee, with occasional check-ins for feedback. There's a balance
between that, and using the SPIP as a license to go finish a design
and proclaim it later. That's not anyone's bad-faith intention, just
the risk of deferring so much.

Mesos support is not a big deal by itself but a fine illustration of
the point. That seems like a fine question of scope now, even if the
'how' or some of the 'what' can be decided later. I raised an eyebrow
here at the reply that this was already judged out-of-scope: how much
are we on the same page about this being a point to consider feedback?

If one wants to VOTE on more details, then this vote just doesn't
matter much. Is a future step to VOTE on some more detailed design
doc? Then that's what I call a "SPIP" and it's practically just
semantics.


On Sun, Mar 3, 2019 at 6:51 PM Xiangrui Meng  wrote:
>
> Hi Felix,
>
> Just to clarify, we are voting on the SPIP, not the companion scoping doc. 
> What is proposed and what we are voting on is to make Spark 
> accelerator-aware. The companion scoping doc and the design sketch are to 
> help demonstrate what features could be implemented based on the use 
> cases and dev resources the co-authors are aware of. The exact scoping and 
> design would require more community involvement; by no means are we 
> finalizing it in this vote thread.
>
> I think copying the goals and non-goals from the companion scoping doc to the 
> SPIP caused the confusion. As mentioned in the SPIP, we proposed to make two 
> major changes at high level:
>
> At cluster manager level, we update or upgrade cluster managers to include 
> GPU support. Then we expose user interfaces for Spark to request GPUs from 
> them.
> Within Spark, we update its scheduler to understand available GPUs allocated 
> to executors, user task requests, and assign GPUs to tasks properly.
>
> We should keep our vote discussion at this level. It doesn't exclude 
> Mesos/Windows/TPU/FPGA, nor does it commit to supporting YARN/K8s. Through the 
> initial scoping work, we found that we certainly need domain experts to 
> discuss the support of each cluster manager and each accelerator type. But 
> adding more details on Mesos or FPGA doesn't change the SPIP at high level. 
> So we concluded the initial scoping, shared the docs, and started this vote.




Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-03 Thread Xiangrui Meng
Hi Felix,

Just to clarify, we are voting on the SPIP, not the companion scoping doc.
What is proposed and what we are voting on is to make Spark
accelerator-aware. The companion scoping doc and the design sketch are to
help demonstrate what features could be implemented based on the use
cases and dev resources the co-authors are aware of. The exact scoping and
design would require more community involvement; by no means are we
finalizing it in this vote thread.

I think copying the goals and non-goals from the companion scoping doc to
the SPIP caused the confusion. As mentioned in the SPIP, we proposed to
make two major changes at high level:

   - At cluster manager level, we update or upgrade cluster managers to
   include GPU support. Then we expose user interfaces for Spark to request
   GPUs from them.
   - Within Spark, we update its scheduler to understand available GPUs
   allocated to executors, user task requests, and assign GPUs to tasks
   properly.

We should keep our vote discussion at this level. It doesn't exclude
Mesos/Windows/TPU/FPGA, nor does it commit to supporting YARN/K8s. Through the
initial scoping work, we found that we certainly need domain experts to
discuss the support of each cluster manager and each accelerator type. But
adding more details on Mesos or FPGA doesn't change the SPIP at high level.
So we concluded the initial scoping, shared the docs, and started this vote.

I suggest updating the goals and non-goals in the SPIP so we don't turn the
vote into a discussion of support or non-support for a specific cluster manager.
After we reach a high-level agreement, the work can be fairly distributed.
If there are both strong demand and dev resources from the community for a
specific cluster manager or an accelerator type, I don't see why we should
block the work. If the work requires more discussion, we can start a new
SPIP thread.

Also see my inline comments below.

On Sun, Mar 3, 2019 at 10:20 AM Felix Cheung 
wrote:

> Great points Sean.
>
> Here’s what I’d like to suggest to move forward.
> Split the SPIP.
>
> If we want to propose upfront homogeneous allocation (aka
> spark.task.gpus), this should be one on its own. And, for instance,
>

This is more like an API/design discussion, which can be done after the
vote. I don't think the feature alone needs a separate SPIP thread. At a
high level, Spark users should be able to request and use GPUs properly.
How to implement it is pending the design.


> I really agree with Sean (like I did in the discuss thread) that we can’t
> simply non-goal Mesos. We have enough maintenance issues as it is. And IIRC
> there was a PR proposed for K8S; I'd like to see that discussion brought
> here as well.
>

+1. As I mentioned above, discussing support for each cluster manager requires
domain experts. The goals and non-goals in the SPIP caused this confusion.
I suggest updating the goals and non-goals and then having separate
discussion for each that doesn't block the main SPIP vote. It would be
great if you or Sean can lead the discussion on Mesos support.


>
> IMO upfront allocation is less useful. Specifically too expensive for
> large jobs.
>

This is also an API/design discussion.


>
> If we want per-stage resource requests, this should be a full SPIP with a lot
> more details to be hashed out. Our work with Horovod brings a few specific
> and critical requirements on how this should work with distributed DL and I
> would like to see those addressed.
>

An SPIP is designed to not have a lot of details. I agree with what Reynold said
on the Table Metadata thread:

"""
In general it'd be better to have the SPIPs be higher level, and put the
detailed APIs in a separate doc. Alternatively, put them in the SPIP but
explicitly vote on the high level stuff and not the detailed APIs.
"""

Could you create a JIRA and document the list of requirements from Horovod
use cases?


>
> In any case I'd like to see more consensus before moving forward; until
> then I'm going to -1 this.
>
>
>
> --
> *From:* Sean Owen 
> *Sent:* Sunday, March 3, 2019 8:15 AM
> *To:* Felix Cheung
> *Cc:* Xingbo Jiang; Yinan Li; dev; Weichen Xu; Marco Gaido
> *Subject:* Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling
>
> I'm for this in general, at least a +0. I do think this has to have a
> story for what to do with the existing Mesos GPU support, which sounds
> entirely like the spark.task.gpus config here. Maybe it's just a
> synonym? That kind of thing.
>
> Requesting different types of GPUs might be a bridge too far, but,
> that's a P2 detail that can be hashed out later. (For example, if a
> v100 is available and k80 was requested, do you use it or fail? Is the
> right level of resource control GPU RAM and cores?)
>
> The per-stage resource requirements sound like the biggest change; you can
> even change the CPU cores requested per pandas UDF? And what about memory
> then? We'll see how that shakes out. That's the only thing I'm kind of
> unsure about in this proposal.

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-03 Thread Felix Cheung
Great points Sean.

Here’s what I’d like to suggest to move forward.
Split the SPIP.

If we want to propose upfront homogeneous allocation (aka spark.task.gpus), 
this should be one on its own. And, for instance, I really agree with Sean 
(like I did in the discuss thread) that we can’t simply non-goal Mesos. We 
have enough maintenance issues as it is. And IIRC there was a PR proposed for 
K8S; I’d like to see that discussion brought here as well.

IMO upfront allocation is less useful. Specifically too expensive for large 
jobs.

If we want per-stage resource requests, this should be a full SPIP with a lot more 
details to be hashed out. Our work with Horovod brings a few specific and 
critical requirements on how this should work with distributed DL and I would 
like to see those addressed.

In any case I’d like to see more consensus before moving forward; until then 
I’m going to -1 this.




From: Sean Owen 
Sent: Sunday, March 3, 2019 8:15 AM
To: Felix Cheung
Cc: Xingbo Jiang; Yinan Li; dev; Weichen Xu; Marco Gaido
Subject: Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

I'm for this in general, at least a +0. I do think this has to have a
story for what to do with the existing Mesos GPU support, which sounds
entirely like the spark.task.gpus config here. Maybe it's just a
synonym? That kind of thing.

Requesting different types of GPUs might be a bridge too far, but,
that's a P2 detail that can be hashed out later. (For example, if a
v100 is available and k80 was requested, do you use it or fail? Is the
right level of resource control GPU RAM and cores?)

The per-stage resource requirements sound like the biggest change;
you can even change the CPU cores requested per pandas UDF? And what about
memory then? We'll see how that shakes out. That's the only thing I'm
kind of unsure about in this proposal.

On Sat, Mar 2, 2019 at 9:35 PM Felix Cheung  wrote:
>
> I’m very hesitant about this.
>
> I don’t want to vote -1, because I personally think it’s important to do, but 
> I’d like to see more discussion points addressed rather than voting purely on 
> the spirit of it.
>
> First, the SPIP doesn’t match the SPIP format proposed and agreed on. (Maybe 
> this is a minor point and perhaps we should also vote to update the SPIP 
> format)
>
> Second, there are multiple PDFs/Google docs and JIRAs. And I think for example 
> the design sketch is not covering the same points as the updated SPIP doc? It 
> would help to make them align before moving forward.
>
> Third, the proposal touches on some fairly core and sensitive components, 
> like the scheduler, and I think more discussions are necessary. We have a few 
> comments there and in the JIRA.
>
>
>
> 
> From: Marco Gaido 
> Sent: Saturday, March 2, 2019 4:18 AM
> To: Weichen Xu
> Cc: Yinan Li; Tom Graves; dev; Xingbo Jiang
> Subject: Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling
>
> +1, a critical feature for AI/DL!
>
> On Sat, Mar 2, 2019 at 05:14, Weichen Xu 
>  wrote:
>>
>> +1, nice feature!
>>
>> On Sat, Mar 2, 2019 at 6:11 AM Yinan Li  wrote:
>>>
>>> +1
>>>
>>> On Fri, Mar 1, 2019 at 12:37 PM Tom Graves  
>>> wrote:
>>>>
>>>> +1 for the SPIP.
>>>>
>>>> Tom
>>>>
>>>> On Friday, March 1, 2019, 8:14:43 AM CST, Xingbo Jiang 
>>>>  wrote:
>>>>
>>>>
>>>> Hi all,
>>>>
>>>> I want to call for a vote of SPARK-24615. It improves Spark by making it 
>>>> aware of GPUs exposed by cluster managers, and hence Spark can match GPU 
>>>> resources with user task requests properly. The proposal and production 
>>>> doc were made available on dev@ to collect input. You can also find a 
>>>> design sketch at SPARK-27005.
>>>>
>>>> The vote will be up for the next 72 hours. Please reply with your vote:
>>>>
>>>> +1: Yeah, let's go forward and implement the SPIP.
>>>> +0: Don't really care.
>>>> -1: I don't think this is a good idea because of the following technical 
>>>> reasons.
>>>>
>>>> Thank you!
>>>>
>>>> Xingbo


Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-03 Thread Sean Owen
I'm for this in general, at least a +0. I do think this has to have a
story for what to do with the existing Mesos GPU support, which sounds
entirely like the spark.task.gpus config here. Maybe it's just a
synonym? That kind of thing.

Requesting different types of GPUs might be a bridge too far, but,
that's a P2 detail that can be hashed out later. (For example, if a
v100 is available and k80 was requested, do you use it or fail? Is the
right level of resource control GPU RAM and cores?)

The per-stage resource requirements sound like the biggest change;
you can even change the CPU cores requested per pandas UDF? And what about
memory then? We'll see how that shakes out. That's the only thing I'm
kind of unsure about in this proposal.
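
(For a sense of what per-stage requests could look like in user code, a
sketch using a builder-style shape; every class, method, and path here is an
assumption made only to render the idea concrete, not an API this thread
agreed on.)

    import org.apache.spark.resource.{ExecutorResourceRequests,
      ResourceProfileBuilder, TaskResourceRequests}
    import org.apache.spark.sql.SparkSession

    val sc = SparkSession.builder().appName("per-stage-sketch")
      .getOrCreate().sparkContext

    // Stage 1: plain CPU ETL under the default resource profile.
    val features = sc.parallelize(1 to 1000, 8).map(_ * 2)

    // Per-stage request: GPU-equipped executors for the training stage only.
    // The discovery script (the path is an assumption) would report the
    // resource as JSON, e.g. {"name": "gpu", "addresses": ["0", "1"]}, which
    // also sketches an answer to standalone auto-discovery.
    val execReqs = new ExecutorResourceRequests()
      .cores(4)
      .resource("gpu", 2, "/opt/spark/getGpus.sh")
    val taskReqs = new TaskResourceRequests().cpus(1).resource("gpu", 1.0)
    val gpuProfile = new ResourceProfileBuilder()
      .require(execReqs).require(taskReqs).build()

    // Stage 2: only this stage is scheduled with the GPU profile.
    features.withResources(gpuProfile).mapPartitions { it =>
      Iterator.single(it.sum) // stand-in for a DL training/inference step
    }.collect()

Note that nothing in this shape answers how CPU cores and memory interact
with such a request; that interaction is exactly what the fuller design
would need to pin down.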

On Sat, Mar 2, 2019 at 9:35 PM Felix Cheung  wrote:
>
> I’m very hesitant about this.
>
> I don’t want to vote -1, because I personally think it’s important to do, but 
> I’d like to see more discussion points addressed rather than voting purely on 
> the spirit of it.
>
> First, the SPIP doesn’t match the SPIP format proposed and agreed on. (Maybe 
> this is a minor point and perhaps we should also vote to update the SPIP 
> format)
>
> Second, there are multiple PDFs/Google docs and JIRAs. And I think for example 
> the design sketch is not covering the same points as the updated SPIP doc? It 
> would help to make them align before moving forward.
>
> Third, the proposal touches on some fairly core and sensitive components, 
> like the scheduler, and I think more discussions are necessary. We have a few 
> comments there and in the JIRA.
>
>
>
> 
> From: Marco Gaido 
> Sent: Saturday, March 2, 2019 4:18 AM
> To: Weichen Xu
> Cc: Yinan Li; Tom Graves; dev; Xingbo Jiang
> Subject: Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling
>
> +1, a critical feature for AI/DL!
>
On Sat, Mar 2, 2019 at 05:14, Weichen Xu 
 wrote:
>>
>> +1, nice feature!
>>
>> On Sat, Mar 2, 2019 at 6:11 AM Yinan Li  wrote:
>>>
>>> +1
>>>
>>> On Fri, Mar 1, 2019 at 12:37 PM Tom Graves  
>>> wrote:
>>>>
>>>> +1 for the SPIP.
>>>>
>>>> Tom
>>>>
>>>> On Friday, March 1, 2019, 8:14:43 AM CST, Xingbo Jiang 
>>>>  wrote:
>>>>
>>>>
>>>> Hi all,
>>>>
>>>> I want to call for a vote of SPARK-24615. It improves Spark by making it 
>>>> aware of GPUs exposed by cluster managers, and hence Spark can match GPU 
>>>> resources with user task requests properly. The proposal and production 
>>>> doc were made available on dev@ to collect input. You can also find a 
>>>> design sketch at SPARK-27005.
>>>>
>>>> The vote will be up for the next 72 hours. Please reply with your vote:
>>>>
>>>> +1: Yeah, let's go forward and implement the SPIP.
>>>> +0: Don't really care.
>>>> -1: I don't think this is a good idea because of the following technical 
>>>> reasons.
>>>>
>>>> Thank you!
>>>>
>>>> Xingbo




Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-02 Thread Felix Cheung
I’m very hesitant about this.

I don’t want to vote -1, because I personally think it’s important to do, but 
I’d like to see more discussion points addressed rather than voting purely on 
the spirit of it.

First, the SPIP doesn’t match the SPIP format proposed and agreed on. (Maybe 
this is a minor point and perhaps we should also vote to update the SPIP format)

Second, there are multiple PDFs/Google docs and JIRAs. And I think for example the 
design sketch is not covering the same points as the updated SPIP doc? It would 
help to make them align before moving forward.

Third, the proposal touches on some fairly core and sensitive components, like 
the scheduler, and I think more discussions are necessary. We have a few 
comments there and in the JIRA.




From: Marco Gaido 
Sent: Saturday, March 2, 2019 4:18 AM
To: Weichen Xu
Cc: Yinan Li; Tom Graves; dev; Xingbo Jiang
Subject: Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

+1, a critical feature for AI/DL!

On Sat, Mar 2, 2019 at 05:14, Weichen Xu 
 wrote:
+1, nice feature!

On Sat, Mar 2, 2019 at 6:11 AM Yinan Li 
 wrote:
+1

On Fri, Mar 1, 2019 at 12:37 PM Tom Graves  wrote:
+1 for the SPIP.

Tom

On Friday, March 1, 2019, 8:14:43 AM CST, Xingbo Jiang 
 wrote:


Hi all,

I want to call for a vote of 
SPARK-24615<https://issues.apache.org/jira/browse/SPARK-24615>. It improves 
Spark by making it aware of GPUs exposed by cluster managers, and hence Spark 
can match GPU resources with user task requests properly. The 
proposal<https://docs.google.com/document/d/1C4J_BPOcSCJc58HL7JfHtIzHrjU0rLRdQM3y7ejil64/edit?usp=sharing>
 and production 
doc<https://docs.google.com/document/d/12JjloksHCdslMXhdVZ3xY5l1Nde3HRhIrqvzGnK_bNE/edit?usp=sharing>
 were made available on dev@ to collect input. You can also find a design 
sketch at SPARK-27005<https://issues.apache.org/jira/browse/SPARK-27005>.

The vote will be up for the next 72 hours. Please reply with your vote:

+1: Yeah, let's go forward and implement the SPIP.
+0: Don't really care.
-1: I don't think this is a good idea because of the following technical 
reasons.

Thank you!

Xingbo


Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-02 Thread Marco Gaido
+1, a critical feature for AI/DL!

On Sat, Mar 2, 2019 at 05:14, Weichen Xu <
weichen...@databricks.com> wrote:

> +1, nice feature!
>
> On Sat, Mar 2, 2019 at 6:11 AM Yinan Li  wrote:
>
>> +1
>>
>> On Fri, Mar 1, 2019 at 12:37 PM Tom Graves 
>> wrote:
>>
>>> +1 for the SPIP.
>>>
>>> Tom
>>>
>>> On Friday, March 1, 2019, 8:14:43 AM CST, Xingbo Jiang <
>>> jiangxb1...@gmail.com> wrote:
>>>
>>>
>>> Hi all,
>>>
>>> I want to call for a vote of SPARK-24615. It improves Spark by making it
>>> aware of GPUs exposed by cluster managers, and hence Spark can match GPU
>>> resources with user task requests properly. The proposal and production
>>> doc were made available on dev@ to collect input. You can also find a
>>> design sketch at SPARK-27005.
>>>
>>> The vote will be up for the next 72 hours. Please reply with your vote:
>>>
>>> +1: Yeah, let's go forward and implement the SPIP.
>>> +0: Don't really care.
>>> -1: I don't think this is a good idea because of the following technical
>>> reasons.
>>>
>>> Thank you!
>>>
>>> Xingbo
>>>
>>


Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-01 Thread Weichen Xu
+1, nice feature!

On Sat, Mar 2, 2019 at 6:11 AM Yinan Li  wrote:

> +1
>
> On Fri, Mar 1, 2019 at 12:37 PM Tom Graves 
> wrote:
>
>> +1 for the SPIP.
>>
>> Tom
>>
>> On Friday, March 1, 2019, 8:14:43 AM CST, Xingbo Jiang <
>> jiangxb1...@gmail.com> wrote:
>>
>>
>> Hi all,
>>
>> I want to call for a vote of SPARK-24615. It improves Spark by making it
>> aware of GPUs exposed by cluster managers, and hence Spark can match GPU
>> resources with user task requests properly. The proposal and production
>> doc were made available on dev@ to collect input. You can also find a
>> design sketch at SPARK-27005.
>>
>> The vote will be up for the next 72 hours. Please reply with your vote:
>>
>> +1: Yeah, let's go forward and implement the SPIP.
>> +0: Don't really care.
>> -1: I don't think this is a good idea because of the following technical
>> reasons.
>>
>> Thank you!
>>
>> Xingbo
>>
>


Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-01 Thread Wenchen Fan
+1

On Sat, Mar 2, 2019 at 6:11 AM Yinan Li  wrote:

> +1
>
> On Fri, Mar 1, 2019 at 12:37 PM Tom Graves 
> wrote:
>
>> +1 for the SPIP.
>>
>> Tom
>>
>> On Friday, March 1, 2019, 8:14:43 AM CST, Xingbo Jiang <
>> jiangxb1...@gmail.com> wrote:
>>
>>
>> Hi all,
>>
>> I want to call for a vote of SPARK-24615. It improves Spark by making it
>> aware of GPUs exposed by cluster managers, and hence Spark can match GPU
>> resources with user task requests properly. The proposal and production
>> doc were made available on dev@ to collect input. You can also find a
>> design sketch at SPARK-27005.
>>
>> The vote will be up for the next 72 hours. Please reply with your vote:
>>
>> +1: Yeah, let's go forward and implement the SPIP.
>> +0: Don't really care.
>> -1: I don't think this is a good idea because of the following technical
>> reasons.
>>
>> Thank you!
>>
>> Xingbo
>>
>


Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-01 Thread Yinan Li
+1

On Fri, Mar 1, 2019 at 12:37 PM Tom Graves 
wrote:

> +1 for the SPIP.
>
> Tom
>
> On Friday, March 1, 2019, 8:14:43 AM CST, Xingbo Jiang <
> jiangxb1...@gmail.com> wrote:
>
>
> Hi all,
>
> I want to call for a vote of SPARK-24615. It improves Spark by making it
> aware of GPUs exposed by cluster managers, and hence Spark can match GPU
> resources with user task requests properly. The proposal and production
> doc were made available on dev@ to collect input. You can also find a
> design sketch at SPARK-27005.
>
> The vote will be up for the next 72 hours. Please reply with your vote:
>
> +1: Yeah, let's go forward and implement the SPIP.
> +0: Don't really care.
> -1: I don't think this is a good idea because of the following technical
> reasons.
>
> Thank you!
>
> Xingbo
>


Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-01 Thread Tom Graves
 +1 for the SPIP.
Tom
On Friday, March 1, 2019, 8:14:43 AM CST, Xingbo Jiang 
 wrote:  
 
 Hi all,
I want to call for a vote of SPARK-24615. It improves Spark by making it aware 
of GPUs exposed by cluster managers, and hence Spark can match GPU resources 
with user task requests properly. The proposal and production doc were made 
available on dev@ to collect input. You can also find a design sketch at 
SPARK-27005.
The vote will be up for the next 72 hours. Please reply with your vote:

+1: Yeah, let's go forward and implement the SPIP.
+0: Don't really care.
-1: I don't think this is a good idea because of the following technical 
reasons.

Thank you!
Xingbo  

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-01 Thread Xiangrui Meng
+1

Btw, as Ryan pointed out last time, +0 doesn't mean "Don't really care."
Official definitions here:

https://www.apache.org/foundation/voting.html#expressing-votes-1-0-1-and-fractions


   - +0: 'I don't feel strongly about it, but I'm okay with this.'
   - -0: 'I won't get in the way, but I'd rather we didn't do this.'




Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-01 Thread Mingjie
+1 

mingjie



Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-01 Thread Xingbo Jiang
Start with +1 from myself.



[VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-01 Thread Xingbo Jiang
Hi all,

I want to call for a vote on SPARK-24615. It improves Spark by making it
aware of GPUs exposed by cluster managers, and hence Spark can match GPU
resources with user task requests properly. The proposal and production doc
were made available on dev@ to collect input. You can also find a design
sketch at SPARK-27005.

The vote will be up for the next 72 hours. Please reply with your vote:

+1: Yeah, let's go forward and implement the SPIP.
+0: Don't really care.
-1: I don't think this is a good idea because of the following technical
reasons.

Thank you!

Xingbo
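
For context, here is a minimal sketch of the user-facing shape of this
feature, assuming the config names and TaskContext.resources API that
shipped with the SPARK-24615 implementation in Spark 3.x; the app name and
discovery script path below are illustrative only, not part of the SPIP.

import org.apache.spark.{SparkConf, SparkContext, TaskContext}

// Request 2 GPUs per executor and 1 GPU per task; the scheduler only
// places a task on an executor with a free GPU, then reports to the task
// which GPU address it was assigned.
val conf = new SparkConf()
  .setAppName("gpu-aware-sketch")
  .set("spark.executor.resource.gpu.amount", "2")
  .set("spark.executor.resource.gpu.discoveryScript", "/opt/spark/getGpus.sh")
  .set("spark.task.resource.gpu.amount", "1")
val sc = new SparkContext(conf)

sc.parallelize(1 to 8, 4).mapPartitions { iter =>
  // Ask the scheduler for this task's assigned GPU(s) instead of
  // guessing from the executor environment.
  val gpus = TaskContext.get().resources()("gpu").addresses // e.g. Array("0")
  // ...bind the DL framework to gpus.head and process the partition...
  iter
}.collect()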