Re: Recap on current status of "SPIP: Support Customized Kubernetes Schedulers"

2022-02-23 Thread Weiwei Yang
Thank you, Yikun.
I am working on SPARK-37809 and SPARK-38310. They are the major
items for the Yunikorn part.
Keep in mind we also need to add the documentation.
Thanks for building up the common things, great work.

On Wed, Feb 23, 2022 at 7:35 PM Yikun Jiang  wrote:

> First, many thanks for all your help (Spark/Volcano/Yunikorn communities) in
> making this SPIP happen!
>
> Especially, @dongjoon-hyun @holdenk @william-wang @attilapiros @HyukjinKwon
> @martin-g @yangwwei @tgravescs
>
> The SPIP is nearing the end of this stage. The basic functionality can be
> considered available at beta level.
>
> I have also drafted a short slide deck to show how to use it and to help
> you understand what we have done:
>
> https://docs.google.com/presentation/d/1XDsTWPcsBe4PQ-1MlBwd9pRl8mySdziE_dJE6iATNw8
>
> Below is a recap to help you understand the current implementation
> and the next steps for the SPIP:
>
> *# Existing work*
> *## Basic part:*
> - SPARK-36059 *New configuration:* ability to specify "schedulerName" in
> driver/executor for Spark on K8S
> - SPARK-37331 *New workflow:* ability to create pre-populated resources
> before driver pod for Spark on K8S
> - SPARK-37145 *New developer API:* support user feature step with
> configuration for Spark on K8S
> - *(reviewing)* *New Job Configurations* for Spark on K8S:
>   - SPARK-38188: spark.kubernetes.job.queue
>   - SPARK-38187: spark.kubernetes.job.[minCPU|minMemory]
>   - SPARK-38189: spark.kubernetes.job.priorityClassName
>
> *## Volcano Part:*
> - SPARK-37258 *New volcano extension* in kubernetes-client
> fabric8io/kubernetes-client#3579
> - SPARK-36061 *New profile:* -Pvolcano
> - SPARK-36061 *New Feature Step:* VolcanoFeatureStep
> - SPARK-36061 *New integration test:*
>   - *Passed on x86 and Arm64 (Linux on Huawei Kunpeng 920 and macOS on
> Apple Silicon M1).*
>   - Test basic volcano workflow
>   - Test all existing tests based on the volcano.
>
> *## Yunikorn Part:*
> @yangwwei will also start working on the Yunikorn module feature step
> this week.
> I will help to complete the Yunikorn integration based on previous
> experience.
>
> *# Next Plan*
> There are also 4 main tasks to be completed before the v3.3 code freeze:
> 1. (reviewing) SPARK-38188: Support queue scheduling configuration
> https://github.com/apache/spark/pull/35553
> 2. (reviewing) SPARK-38187: Support resource reservation (minCPU/minMemory
> configuration)
> https://github.com/apache/spark/pull/35640
> 3. (reviewing) SPARK-38189: Support priority scheduling (priorityClass
> configuration):
> https://issues.apache.org/jira/browse/SPARK-38189
> https://github.com/apache/spark/pull/35639
> 4. (WIP) SPARK-37809: Yunikorn integration
>
> Several miscellaneous items will also be completed before 3.3:
> 1. Integrate Volcano deployment into the integration tests (x86 and arm)
> - Add it to the Spark Kubernetes integration tests once cross-compilation
> is supported:
> https://github.com/volcano-sh/volcano/pull/1571
> 2. Complete the documentation and test guidelines.
>
> Please feel free to contact me if you have any other concerns! Thanks!
>
> [1] https://issues.apache.org/jira/browse/SPARK-36057
>
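
The settings listed in the recap above can be gathered into a single
spark-submit invocation. The following is a minimal sketch, not the project's
tooling. It assumes the job property names exactly as they are written in the
thread (spark.kubernetes.job.queue, minCPU, minMemory, priorityClassName),
which were still under review at the time and may be named differently in a
released Spark version. The two scheduler-name properties are likewise an
assumption about what SPARK-36059 exposes, and build_submit_args is a
hypothetical helper.

```python
from typing import Dict, List, Optional


def build_submit_args(
    scheduler_name: str,
    queue: Optional[str] = None,
    min_cpu: Optional[str] = None,
    min_memory: Optional[str] = None,
    priority_class: Optional[str] = None,
) -> List[str]:
    """Return a spark-submit argument list for a custom Kubernetes scheduler.

    Property names are assumptions taken from the thread, not a verified API.
    """
    conf: Dict[str, Optional[str]] = {
        # Assumed names for the driver/executor "schedulerName" settings.
        "spark.kubernetes.driver.scheduler.name": scheduler_name,
        "spark.kubernetes.executor.scheduler.name": scheduler_name,
        # Job-level settings as proposed (under review) in the thread.
        "spark.kubernetes.job.queue": queue,
        "spark.kubernetes.job.minCPU": min_cpu,
        "spark.kubernetes.job.minMemory": min_memory,
        "spark.kubernetes.job.priorityClassName": priority_class,
    }
    args = ["spark-submit"]
    for key, value in conf.items():
        if value is not None:  # unset options are simply left out
            args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Print the command one option per line, shell-continuation style.
    print(" \\\n  ".join(build_submit_args("volcano", queue="default", min_cpu="2")))
```

Each --conf value is emitted as a single key=value token, which is the form
spark-submit expects.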


Re: [VOTE][SPIP] Support Customized Kubernetes Schedulers Proposal

2022-01-05 Thread Weiwei Yang
+1 (non-binding)

On Wed, Jan 5, 2022 at 5:09 PM Chaoran Yu  wrote:

> +1 (non-binding). Thanks Yikun for putting it all together
>
> On Wed, Jan 5, 2022 at 5:07 PM Yikun Jiang  wrote:
>
>> Hi all,
>>
>> I’d like to start a vote for SPIP: "Support Customized Kubernetes
>> Schedulers Proposal"
>>
>> The SPIP is to support customized Kubernetes schedulers in Spark on
>> Kubernetes.
>>
>> Please also refer to:
>>
>> - Previous discussion in dev mailing list: [DISCUSSION] SPIP: Support
>> Volcano/Alternative Schedulers Proposal
>> 
>> - Design doc: [SPIP] Spark-36057 Support Customized Kubernetes
>> Schedulers Proposal
>> 
>> - JIRA: SPARK-36057 
>>
>> Please vote on the SPIP:
>>
>> [ ] +1: Accept the proposal as an official SPIP
>> [ ] +0
>> [ ] -1: I don’t think this is a good idea because …
>>
>> Regards,
>> Yikun
>>
>


Re: [DISCUSSION] SPIP: Support Volcano/Alternative Schedulers Proposal

2022-01-05 Thread Weiwei Yang
resource
>>>>>>> reservation feature in case of insufficient resources (tested with
>>>>>>> TPC-DS).
>>>>>>>
>>>>>>> We are still working on more optimizations. Besides performance,
>>>>>>> Volcano is continuously being enhanced in the four directions below to
>>>>>>> provide abilities that users care about.
>>>>>>> - Full lifecycle management for jobs
>>>>>>> - Scheduling policies for high-performance workloads (fair-share,
>>>>>>> topology, SLA, reservation, preemption, backfill, etc.)
>>>>>>> - Support for heterogeneous hardware
>>>>>>> - Performance optimization for high-performance workloads
>>>>>>>
>>>>>>> Thanks
>>>>>>> LeiBo
>>>>>>>
>>>>>>> On Tue, Jan 4, 2022 at 18:12 Mich Talebzadeh wrote:
>>>>>>>
>>>>>>>> Interesting, thanks
>>>>>>>>
>>>>>>>> Do you have a ballpark figure (a rough numerical estimate) of how
>>>>>>>> much adding Volcano as an alternative scheduler is going to
>>>>>>>> improve Spark on k8s performance?
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>>
>>>>>>>> On Tue, 4 Jan 2022 at 09:43, Yikun Jiang 
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi, folks! Wishing you all the best in 2022.
>>>>>>>>>
>>>>>>>>> I'd like to share the current status on "Support Customized K8S
>>>>>>>>> Scheduler in Spark".
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> https://docs.google.com/document/d/1xgQGRpaHQX6-QH_J9YV2C2Dh6RpXefUpLM7KGkzL6Fg/edit#heading=h.1quyr1r2kr5n
>>>>>>>>>
>>>>>>>>> Framework/Common support
>>>>>>>>>
>>>>>>>>> - The Volcano and Yunikorn teams joined the discussion and completed
>>>>>>>>> the initial doc on the framework/common part.
>>>>>>>>>
>>>>>>>>> - SPARK-37145 <https://issues.apache.org/jira/browse/SPARK-37145>
>>>>>>>>> (under review): We proposed extending the customized scheduler
>>>>>>>>> simply by using a custom feature step. Once it is merged, it will
>>>>>>>>> meet the requirements of a customized scheduler. After that, the
>>>>>>>>> user can enable the feature step and scheduler like:
>>>>>>>>>
>>>>>>>>> spark-submit \
>>>>>>>>>
>>>>>>>>> --conf spark.kubernetes.scheduler.name=volcano \
>>>>>>>>>

Re: [DISCUSSION] SPIP: Support Volcano/Alternative Schedulers Proposal

2021-12-01 Thread Weiwei Yang
Thank you Yikun for the info, and thanks for inviting me to a meeting to
discuss this.
I appreciate your effort to put these together, and I agree that the
purpose is to make Spark easy/flexible enough to support other K8s
schedulers (not just for Volcano).
As discussed, could you please help to abstract out the things in common
and allow Spark to plug different implementations? I'd be happy to work
with you guys on this issue.
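
The request above, to abstract out the common parts and let Spark plug in
different implementations, can be pictured with a small registry of
per-scheduler steps. This is only a sketch of the idea in Python, not Spark's
actual feature-step API: SchedulerFeatureStep, register and apply_scheduler
are hypothetical names, and the annotation and label keys used for Volcano and
Yunikorn are illustrative assumptions.

```python
import copy
from abc import ABC, abstractmethod
from typing import Dict, Type


class SchedulerFeatureStep(ABC):
    """Hypothetical hook that customizes a pod manifest for one scheduler."""

    def __init__(self, job_id: str) -> None:
        self.job_id = job_id

    @abstractmethod
    def configure_pod(self, pod: dict) -> dict:
        ...


# Maps a scheduler name to the step that knows its conventions.
_REGISTRY: Dict[str, Type[SchedulerFeatureStep]] = {}


def register(name: str):
    """Class decorator that plugs a step in under a scheduler name."""
    def decorator(cls: Type[SchedulerFeatureStep]) -> Type[SchedulerFeatureStep]:
        _REGISTRY[name] = cls
        return cls
    return decorator


@register("volcano")
class VolcanoStep(SchedulerFeatureStep):
    def configure_pod(self, pod: dict) -> dict:
        # Assumed annotation key linking the pod to a PodGroup.
        annotations = pod["metadata"].setdefault("annotations", {})
        annotations["scheduling.k8s.io/group-name"] = f"{self.job_id}-podgroup"
        return pod


@register("yunikorn")
class YunikornStep(SchedulerFeatureStep):
    def configure_pod(self, pod: dict) -> dict:
        # Assumed label key identifying the application.
        labels = pod["metadata"].setdefault("labels", {})
        labels["applicationId"] = self.job_id
        return pod


def apply_scheduler(pod: dict, scheduler_name: str, job_id: str) -> dict:
    """Return a copy of the pod configured for the named scheduler."""
    new_pod = copy.deepcopy(pod)  # never mutate the caller's manifest
    new_pod.setdefault("metadata", {})
    new_pod.setdefault("spec", {})["schedulerName"] = scheduler_name
    step_cls = _REGISTRY.get(scheduler_name)
    if step_cls is None:
        # Unknown scheduler: the common part (schedulerName) still applies.
        return new_pod
    return step_cls(job_id).configure_pod(new_pod)
```

The common code only sets schedulerName and looks up a step. Everything
scheduler-specific lives in the registered classes, so adding another
scheduler does not touch the common path.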


On Tue, Nov 30, 2021 at 6:49 PM Yikun Jiang  wrote:

> @Weiwei @Chenya
>
> > Thanks for bringing this up. This is quite interesting, we definitely
> should participate more in the discussions.
>
> Thanks for your reply, and welcome to the discussion. I think the
> input from Yunikorn is critical.
>
> > The main thing here is, the Spark community should make Spark pluggable
> in order to support other schedulers, not just for Volcano. It looks like
> this proposal is pushing really hard for adopting PodGroup, which isn't
> part of K8s yet, that to me is problematic.
>
> Definitely yes, we are on the same page.
>
> I think we have the same goal: propose a general and reasonable mechanism
> to make spark on k8s with a custom scheduler more usable.
>
> As for PodGroup, allow me to give a brief introduction:
> - The PodGroup definition has been approved by Kubernetes officially in
> KEP-583. [1]
> - It can be regarded as a general concept/standard in Kubernetes rather
> than a concept specific to Volcano. There are also other implementations,
> such as [2][3].
> - Kubernetes recommends using CRDs to extend it with whatever users
> need. [4]
> - Volcano, as an extension, provides an interface to maintain the life
> cycle of the PodGroup CRD and uses volcano-scheduler to complete the
> scheduling.
>
> [1]
> https://github.com/kubernetes/enhancements/tree/master/keps/sig-scheduling/583-coscheduling
> [2]
> https://github.com/kubernetes-sigs/scheduler-plugins/tree/master/pkg/coscheduling#podgroup
> [3] https://github.com/kubernetes-sigs/kube-batch
> [4]
> https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/
>
> Regards,
> Yikun
>
>
> On Wed, Dec 1, 2021 at 5:57 AM Weiwei Yang wrote:
>
>> Hi Chenya
>>
>> Thanks for bringing this up. This is quite interesting, we definitely
>> should participate more in the discussions.
>> The main thing here is, the Spark community should make Spark pluggable
>> in order to support other schedulers, not just for Volcano. It looks like
>> this proposal is pushing really hard for adopting PodGroup, which isn't
>> part of K8s yet, that to me is problematic.
>>
>> On Tue, Nov 30, 2021 at 9:21 AM Prasad Paravatha <
>> prasad.parava...@gmail.com> wrote:
>>
>>> This is a great feature/idea.
>>> I'd love to get involved in some form (testing and/or documentation).
>>> This could be my 1st contribution to Spark!
>>>
>>> On Tue, Nov 30, 2021 at 10:46 PM John Zhuge  wrote:
>>>
>>>> +1 Kudos to Yikun and the community for starting the discussion!
>>>>
>>>> On Tue, Nov 30, 2021 at 8:47 AM Chenya Zhang <
>>>> chenyazhangche...@gmail.com> wrote:
>>>>
>>>>> Thanks folks for bringing up the topic of natively integrating Volcano
>>>>> and other alternative schedulers into Spark!
>>>>>
>>>>> +Weiwei, Wilfred, Chaoran. We would love to contribute to the
>>>>> discussion as well.
>>>>>
>>>>> From our side, we have been using and improving on one alternative
>>>>> resource scheduler, Apache YuniKorn (https://yunikorn.apache.org/),
>>>>> for Spark on Kubernetes in production at Apple with solid results in the
>>>>> past year. It is capable of supporting Gang scheduling (similar to
>>>>> PodGroups), multi-tenant resource queues (similar to YARN), FIFO, and
>>>>> other handy features like bin packing to enable efficient autoscaling, etc.
>>>>>
>>>>> Natively integrating with Spark would provide more flexibility for
>>>>> users and reduce the extra cost and potential inconsistency of maintaining
>>>>> different layers of resource strategies. One interesting topic we hope to
>>>>> discuss more about is dynamic allocation, which would benefit from native
>>>>> coordination between Spark and resource schedulers in K8s &
>>>>> cloud environment for an optimal resource efficiency.
>>>>>
>>>>>
>>>>> On Tue, Nov 30, 2021 at 8:10 AM Holden Karau 

Re: [DISCUSSION] SPIP: Support Volcano/Alternative Schedulers Proposal

2021-11-30 Thread Weiwei Yang
IIUC, PodGroup is only supported in Volcano, this is not a common API
adopted by K8s, at least not today.
Spark needs to be agnostic about the schedulers, as an example, when we run
Spark on YARN, does Spark need to know if that is FairScheduler or
CapacityScheduler?
IMO, we should build things general enough in Spark in order to support
different schedulers, instead of having extra effort to support them one by
one.

On Tue, Nov 30, 2021 at 2:18 PM Mich Talebzadeh 
wrote:

> Hi,
>
> Well, in mitigation, one cannot address all the available scheduler
> options at once. Certainly PodGroup is an option, unless there are reasons
> to believe that this is not a right choice. Therefore, I stand corrected, I
> fail to see where problematic comes into it, unless you may care to
> elaborate your concerns.
>
>
> HTH
>
> On Tue, 30 Nov 2021 at 21:57, Weiwei Yang  wrote:
>
>> Hi Chenya
>>
>> Thanks for bringing this up. This is quite interesting, we definitely
>> should participate more in the discussions.
>> The main thing here is, the Spark community should make Spark pluggable
>> in order to support other schedulers, not just for Volcano. It looks like
>> this proposal is pushing really hard for adopting PodGroup, which isn't
>> part of K8s yet, that to me is problematic.
>>
>> On Tue, Nov 30, 2021 at 9:21 AM Prasad Paravatha <
>> prasad.parava...@gmail.com> wrote:
>>
>>> This is a great feature/idea.
>>> I'd love to get involved in some form (testing and/or documentation).
>>> This could be my 1st contribution to Spark!
>>>
>>> On Tue, Nov 30, 2021 at 10:46 PM John Zhuge  wrote:
>>>
>>>> +1 Kudos to Yikun and the community for starting the discussion!
>>>>
>>>> On Tue, Nov 30, 2021 at 8:47 AM Chenya Zhang <
>>>> chenyazhangche...@gmail.com> wrote:
>>>>
>>>>> Thanks folks for bringing up the topic of natively integrating Volcano
>>>>> and other alternative schedulers into Spark!
>>>>>
>>>>> +Weiwei, Wilfred, Chaoran. We would love to contribute to the
>>>>> discussion as well.
>>>>>
>>>>> From our side, we have been using and improving on one alternative
>>>>> resource scheduler, Apache YuniKorn (https://yunikorn.apache.org/),
>>>>> for Spark on Kubernetes in production at Apple with solid results in the
>>>>> past year. It is capable of supporting Gang scheduling (similar to
>>>>> PodGroups), multi-tenant resource queues (similar to YARN), FIFO, and
>>>>> other handy features like bin packing to enable efficient autoscaling, etc.
>>>>>
>>>>> Natively integrating with Spark would provide more flexibility for
>>>>> users and reduce the extra cost and potential inconsistency of maintaining
>>>>> different layers of resource strategies. One interesting topic we hope to
>>>>> discuss more about is dynamic allocation, which would benefit from native
>>>>> coordination between Spark and resource schedulers in K8s &
>>>>> cloud environment for an optimal resource efficiency.
>>>>>
>>>>>
>>>>> On Tue, Nov 30, 2021 at 8:10 AM Holden Karau 
>>>>> wrote:
>>>>>
>>>>>> Thanks for putting this together, I’m really excited for us to add
>>>>>> better batch scheduling integrations.
>>>>>>
>>>>>> On Tue, Nov 30, 2021 at 12:46 AM Yikun Jiang 
>>>>>> wrote:
>>>>>>
>>>>>>> Hey everyone,
>>>>>>>
>>>>>>> I'd like to start a discussion on "Support Volcano/Alternative
>>>>>>> Schedulers Proposal".
>>>>>>>
>>>>>>> This SPIP is proposed to make spark k8s schedulers provide more YARN
>>>>>>> like features (such as queues and minimum resources before scheduling
>>>>>>> jobs) that many folks want on Kubernetes.
>>>>>>>
>>>>>>> The goal of this SPIP is to improve current spark k8s scheduler
>>>>>>> implementations, add the ability of batch scheduling and support
>>>>>>> volcano as one of implementations.
>>>>>>>
>>>>>>> Design doc:
>>>>>>> https://docs.google.com/document/d/1xgQGRpaHQX6-QH_J9YV2C2Dh6RpXefUpLM7KGkzL6Fg
>>>>>>> JIRA: https://issues.apache.org/jira/browse/SPARK-36057
>>>>>>> Part of PRs:
>>>>>>> Ability to create resources
>>>>>>> https://github.com/apache/spark/pull/34599
>>>>>>> Add PodGroupFeatureStep: https://github.com/apache/spark/pull/34456
>>>>>>>
>>>>>>> Regards,
>>>>>>> Yikun
>>>>>>>
>>>>>> --
>>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>>>> https://amzn.to/2MaRAG9
>>>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>>>
>>>>>
>>>>
>>>> --
>>>> John Zhuge
>>>>
>>>
>>>
>>> --
>>> Regards,
>>> Prasad Paravatha
>>>
>>
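
The "minimum resources before scheduling jobs" behaviour mentioned in the
proposal, which the thread ties to PodGroup and gang scheduling, amounts to an
all-or-nothing admission check. The sketch below illustrates that idea only.
It is not Volcano's or Yunikorn's algorithm: it treats the cluster as a single
pool of free CPU and memory, ignores per-node placement, and the names
PodRequest, PodGroup and gang_admit are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PodRequest:
    name: str
    cpu: float         # requested CPU cores
    memory_gib: float  # requested memory in GiB


@dataclass
class PodGroup:
    min_member: int          # pods that must start together
    pods: List[PodRequest]


def gang_admit(group: PodGroup, free_cpu: float, free_memory_gib: float) -> List[str]:
    """Return the pod names to start, or [] if the gang cannot be satisfied.

    Pods are tried in order against one aggregate resource pool. Nothing is
    started unless at least min_member pods fit at the same time.
    """
    placed: List[str] = []
    for pod in group.pods:
        if pod.cpu <= free_cpu and pod.memory_gib <= free_memory_gib:
            free_cpu -= pod.cpu
            free_memory_gib -= pod.memory_gib
            placed.append(pod.name)
    # All-or-nothing: a partial gang would hold resources without progress.
    return placed if len(placed) >= group.min_member else []


if __name__ == "__main__":
    job = PodGroup(
        min_member=3,
        pods=[
            PodRequest("driver", 1, 2),
            PodRequest("exec-1", 2, 4),
            PodRequest("exec-2", 2, 4),
        ],
    )
    print(gang_admit(job, free_cpu=4, free_memory_gib=16))
    print(gang_admit(job, free_cpu=5, free_memory_gib=16))
```

With 4 free cores only the driver and one executor fit, so nothing is
admitted. With 5 free cores all three pods are admitted together.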


Re: [DISCUSSION] SPIP: Support Volcano/Alternative Schedulers Proposal

2021-11-30 Thread Weiwei Yang
Hi Chenya

Thanks for bringing this up. This is quite interesting, we definitely
should participate more in the discussions.
The main thing here is, the Spark community should make Spark pluggable in
order to support other schedulers, not just for Volcano. It looks like this
proposal is pushing really hard for adopting PodGroup, which isn't part of
K8s yet, that to me is problematic.

On Tue, Nov 30, 2021 at 9:21 AM Prasad Paravatha 
wrote:

> This is a great feature/idea.
> I'd love to get involved in some form (testing and/or documentation). This
> could be my 1st contribution to Spark!
>
> On Tue, Nov 30, 2021 at 10:46 PM John Zhuge  wrote:
>
>> +1 Kudos to Yikun and the community for starting the discussion!
>>
>> On Tue, Nov 30, 2021 at 8:47 AM Chenya Zhang 
>> wrote:
>>
>>> Thanks folks for bringing up the topic of natively integrating Volcano
>>> and other alternative schedulers into Spark!
>>>
>>> +Weiwei, Wilfred, Chaoran. We would love to contribute to the discussion
>>> as well.
>>>
>>> From our side, we have been using and improving on one alternative
>>> resource scheduler, Apache YuniKorn (https://yunikorn.apache.org/), for
>>> Spark on Kubernetes in production at Apple with solid results in the past
>>> year. It is capable of supporting Gang scheduling (similar to PodGroups),
>>> multi-tenant resource queues (similar to YARN), FIFO, and other handy
>>> features like bin packing to enable efficient autoscaling, etc.
>>>
>>> Natively integrating with Spark would provide more flexibility for users
>>> and reduce the extra cost and potential inconsistency of maintaining
>>> different layers of resource strategies. One interesting topic we hope to
>>> discuss more about is dynamic allocation, which would benefit from native
>>> coordination between Spark and resource schedulers in K8s &
>>> cloud environment for an optimal resource efficiency.
>>>
>>>
>>> On Tue, Nov 30, 2021 at 8:10 AM Holden Karau 
>>> wrote:
>>>
>>>> Thanks for putting this together, I’m really excited for us to add
>>>> better batch scheduling integrations.
>>>>
>>>> On Tue, Nov 30, 2021 at 12:46 AM Yikun Jiang
>>>> wrote:
>>>>
>>>>> Hey everyone,
>>>>>
>>>>> I'd like to start a discussion on "Support Volcano/Alternative
>>>>> Schedulers Proposal".
>>>>>
>>>>> This SPIP is proposed to make spark k8s schedulers provide more YARN
>>>>> like features (such as queues and minimum resources before scheduling
>>>>> jobs) that many folks want on Kubernetes.
>>>>>
>>>>> The goal of this SPIP is to improve current spark k8s scheduler
>>>>> implementations, add the ability of batch scheduling and support
>>>>> volcano as one of implementations.
>>>>>
>>>>> Design doc:
>>>>> https://docs.google.com/document/d/1xgQGRpaHQX6-QH_J9YV2C2Dh6RpXefUpLM7KGkzL6Fg
>>>>> JIRA: https://issues.apache.org/jira/browse/SPARK-36057
>>>>> Part of PRs:
>>>>> Ability to create resources https://github.com/apache/spark/pull/34599
>>>>> Add PodGroupFeatureStep: https://github.com/apache/spark/pull/34456
>>>>>
>>>>> Regards,
>>>>> Yikun
>>>>>
>>>> --
>>>> Twitter: https://twitter.com/holdenkarau
>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>> https://amzn.to/2MaRAG9
>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>
>>>
>>
>> --
>> John Zhuge
>>
>
>
> --
> Regards,
> Prasad Paravatha
>