Re: [DISCUSSION] SPIP: Support Volcano/Alternative Schedulers Proposal

Weiwei Yang Tue, 30 Nov 2021 15:42:44 -0800

IIUC, PodGroup is only supported in Volcano; it is not a common API
adopted by K8s, at least not today.
Spark needs to be agnostic about schedulers. As an example, when we run
Spark on YARN, does Spark need to know whether the scheduler is FairScheduler or
CapacityScheduler?
IMO, we should build things general enough in Spark to support
different schedulers, instead of spending extra effort to support them one by
one.


On Tue, Nov 30, 2021 at 2:18 PM Mich Talebzadeh <[email protected]>
wrote:

> Hi,
>
> Well, in mitigation, one cannot address all the available scheduler
> options at once. Certainly PodGroup is an option, unless there are reasons
> to believe that it is not the right choice. I stand to be corrected, but I
> fail to see where "problematic" comes into it, unless you care to
> elaborate on your concerns.
>
>
> HTH
>
>
>    view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
>
> On Tue, 30 Nov 2021 at 21:57, Weiwei Yang <[email protected]> wrote:
>
>> Hi Chenya
>>
>> Thanks for bringing this up. This is quite interesting, and we definitely
>> should participate more in the discussions.
>> The main thing here is that the Spark community should make Spark pluggable
>> in order to support other schedulers, not just Volcano. It looks like
>> this proposal is pushing really hard for adopting PodGroup, which isn't
>> part of K8s yet. That, to me, is problematic.
>>
>> On Tue, Nov 30, 2021 at 9:21 AM Prasad Paravatha <
>> [email protected]> wrote:
>>
>>> This is a great feature/idea.
>>> I'd love to get involved in some form (testing and/or documentation).
>>> This could be my 1st contribution to Spark!
>>>
>>> On Tue, Nov 30, 2021 at 10:46 PM John Zhuge <[email protected]> wrote:
>>>
>>>> +1 Kudos to Yikun and the community for starting the discussion!
>>>>
>>>> On Tue, Nov 30, 2021 at 8:47 AM Chenya Zhang <
>>>> [email protected]> wrote:
>>>>
>>>>> Thanks folks for bringing up the topic of natively integrating Volcano
>>>>> and other alternative schedulers into Spark!
>>>>>
>>>>> +Weiwei, Wilfred, Chaoran. We would love to contribute to the
>>>>> discussion as well.
>>>>>
>>>>> From our side, we have been using and improving on one alternative
>>>>> resource scheduler, Apache YuniKorn (https://yunikorn.apache.org/),
>>>>> for Spark on Kubernetes in production at Apple with solid results in the
>>>>> past year. It is capable of supporting Gang scheduling (similar to
>>>>> PodGroups), multi-tenant resource queues (similar to YARN), FIFO, and other
>>>>> handy features like bin packing to enable efficient autoscaling, etc.
>>>>>
>>>>> Natively integrating with Spark would provide more flexibility for
>>>>> users and reduce the extra cost and potential inconsistency of maintaining
>>>>> different layers of resource strategies. One interesting topic we hope to
>>>>> discuss further is dynamic allocation, which would benefit from native
>>>>> coordination between Spark and resource schedulers in K8s and
>>>>> cloud environments for optimal resource efficiency.
>>>>>
>>>>>
>>>>> On Tue, Nov 30, 2021 at 8:10 AM Holden Karau <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Thanks for putting this together, I’m really excited for us to add
>>>>>> better batch scheduling integrations.
>>>>>>
>>>>>> On Tue, Nov 30, 2021 at 12:46 AM Yikun Jiang <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hey everyone,
>>>>>>>
>>>>>>> I'd like to start a discussion on "Support Volcano/Alternative
>>>>>>> Schedulers Proposal".
>>>>>>>
>>>>>>> This SPIP proposes making the Spark K8s schedulers provide more YARN-like
>>>>>>> features (such as queues and minimum resources before scheduling jobs)
>>>>>>> that many folks want on Kubernetes.
>>>>>>>
>>>>>>> The goal of this SPIP is to improve the current Spark K8s scheduler
>>>>>>> implementations, add batch scheduling support, and support Volcano as
>>>>>>> one of the implementations.
>>>>>>>
>>>>>>> Design doc:
>>>>>>> https://docs.google.com/document/d/1xgQGRpaHQX6-QH_J9YV2C2Dh6RpXefUpLM7KGkzL6Fg
>>>>>>> JIRA: https://issues.apache.org/jira/browse/SPARK-36057
>>>>>>> Some of the PRs:
>>>>>>> Ability to create resources: https://github.com/apache/spark/pull/34599
>>>>>>> Add PodGroupFeatureStep: https://github.com/apache/spark/pull/34456
>>>>>>>
>>>>>>> Regards,
>>>>>>> Yikun
>>>>>>>
>>>>>> --
>>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>>>> https://amzn.to/2MaRAG9
>>>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>>>
>>>>>
>>>>
>>>> --
>>>> John Zhuge
>>>>
>>>
>>>
>>> --
>>> Regards,
>>> Prasad Paravatha
>>>
>>
