Re: SPIP: Accelerator-aware Scheduling

Sean Owen Fri, 01 Mar 2019 09:10:28 -0800

Sounds like a good reason to get in Hadoop 3.1 support.
I guess my point is that Spark's Mesos GPU integration has already
existed for a long while. It doesn't necessarily need to be expanded,
but it seems like it must fit into the more general framework here.
That might take little or no effort; I just want to make sure there
aren't two different ways GPUs are supported.


On Fri, Mar 1, 2019 at 9:48 AM Xingbo Jiang <[email protected]> wrote:
>
> Hi Sean,
>
> To support GPU scheduling on a YARN cluster, we have to update the Hadoop 
> version to 3.1.2+. However, if we decide not to upgrade Hadoop to that 
> version or later for Spark 3.0, then we just have to disable GPU scheduling 
> on YARN (or fall back); users will still be able to use the feature on a 
> Standalone or Kubernetes cluster.
>
> We didn't include Mesos support in the current SPIP because we didn't receive 
> use cases that require GPU scheduling on a Mesos cluster. However, we can 
> still add Mesos support in the future if we see valid use cases.
>
> Thanks!
>
> Xingbo
>
> On Fri, Mar 1, 2019 at 10:39 PM Sean Owen <[email protected]> wrote:
>>
>> Two late-breaking questions:
>>
>> This basically requires Hadoop 3.1 for YARN support?
>> Mesos support is listed as a non-goal, but it already has support for 
>> requesting GPUs in Spark. That would be 'harmonized' with this 
>> implementation even if it's not extended?
>>
>> On Fri, Mar 1, 2019, 7:48 AM Xingbo Jiang <[email protected]> wrote:
>>>
>>> I think we are aligned on the commitment; I'll start a vote thread for this 
>>> shortly.
>>>
>>> On Wed, Feb 27, 2019 at 6:47 AM Xiangrui Meng <[email protected]> wrote:
>>>>
>>>> In case there are issues accessing the Google Docs, I attached PDF files 
>>>> to the JIRA.
>>>>
>>>> On Tue, Feb 26, 2019 at 7:41 AM Xingbo Jiang <[email protected]> wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> I want to send a revised SPIP on implementing Accelerator (GPU)-aware 
>>>>> Scheduling. It improves Spark by making it aware of GPUs exposed by 
>>>>> cluster managers, so that Spark can match GPU resources with user task 
>>>>> requests properly. If you have scenarios that need to run workloads 
>>>>> (DL/ML/signal processing, etc.) on a Spark cluster with GPU nodes, 
>>>>> please help review and check how it fits into your use cases. Your 
>>>>> feedback would be greatly appreciated!
>>>>>
>>>>> # Links to SPIP and Product doc:
>>>>>
>>>>> * Jira issue for the SPIP: 
>>>>> https://issues.apache.org/jira/browse/SPARK-24615
>>>>> * Google Doc: 
>>>>> https://docs.google.com/document/d/1C4J_BPOcSCJc58HL7JfHtIzHrjU0rLRdQM3y7ejil64/edit?usp=sharing
>>>>> * Product Doc: 
>>>>> https://docs.google.com/document/d/12JjloksHCdslMXhdVZ3xY5l1Nde3HRhIrqvzGnK_bNE/edit?usp=sharing
>>>>>
>>>>> Thank you!
>>>>>
>>>>> Xingbo
