Re: Spark on Kubernetes scheduler variety

2021-07-08 Thread Mich Talebzadeh
Splendid. Please invite me to the next meeting mich.talebza...@gmail.com Timezone London, UK *GMT+1* Thanks,

Re: Spark on Kubernetes scheduler variety

2021-07-08 Thread Holden Karau
Hi Y'all, We had an initial meeting which went well, and got some more context around Volcano and its near-term roadmap. We talked about the impact of scheduler deadlocking and some ways that we could potentially improve integration from the Spark and Volcano sides respectively. I'm going to sta

Re: Spark on Kubernetes scheduler variety

2021-07-01 Thread Mich Talebzadeh
Thanks. I also have a three node cluster in my lab running Red Hat 7.6 with 64GB of RAM etc. However, I doubt whether minikube will be useful. If we can get a Google Kubernetes Engine (GKE) cluster (which is a fully managed service) from Google on a loan

Re: Spark on Kubernetes scheduler variety

2021-07-01 Thread Holden Karau
I do my own dev work on a personal cluster I have down in Fremont which I’ve got set up using k3sup. I know some devs use minikube (and our integration tests can). But yeah, if there was a vendor willing to hand out Kube resources that could simplify our dev cycles. On Thu, Jul 1, 2021 at 12:52 PM M

Re: Spark on Kubernetes scheduler variety

2021-07-01 Thread Mich Talebzadeh
Hi, A rather simple question. As Kubernetes is specialized work requiring some effort to set up properly, do we have a dev/test bed to conduct development work? What I am trying to get at is if there is official support for Volcano stuff that a vendor can provide free cluster usage in excha

Re: Spark on Kubernetes scheduler variety

2021-06-30 Thread Mich Talebzadeh
Hi Klaus, Thanks https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/1289

Re: Spark on Kubernetes scheduler variety

2021-06-30 Thread Klaus Ma
Hi Mich, Would you help open an issue in the spark-on-k8s-operator repo? We're going to submit a PR to update the install steps :) -- Klaus On Wed, Jun 30, 2021 at 12:24 AM Mich Talebzadeh wrote: > Hi Yikun > > In reference > > > https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob

Re: Spark on Kubernetes scheduler variety

2021-06-30 Thread Mich Talebzadeh
Hi Michel, Thanks for the link. I am familiar with G-Research as I met them at my presentation in London back in October 2019. The Armada project seems to create super-scheduling on top of Kubernetes clusters and I quote: "Armada is an application to achieve high throughput of run-to-completion

Re: Spark on Kubernetes scheduler variety

2021-06-29 Thread Mich Talebzadeh
Hi Yikun In reference https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob/master/docs/volcano-integration.md Trying to install Volcano I am getting this error helm repo add incubator http://storage.googleapis.com/kubernetes-charts-incubator Error: looks like "http://storage.google

Re: Spark on Kubernetes scheduler variety

2021-06-29 Thread Mich Talebzadeh
Cool, thanks!

Re: Spark on Kubernetes scheduler variety

2021-06-28 Thread Yikun Jiang
> Is this the correct link for integrating Volcano with Spark? Yes, it is the Kubernetes operator style of integrating Volcano. If you just want to use the spark-submit style to submit a job with native support, see [2] for reference. [1] https://github.com/huawei-cloudnative/spark/commit/6c1f37525f02635

Re: Spark on Kubernetes scheduler variety

2021-06-28 Thread Mich Talebzadeh
Hi Yikun, Is this the correct link for integrating Volcano with Spark? spark-on-k8s-operator/volcano-integration.md at master · GoogleCloudPlatform/spark-on-k8s-operator · GitHub Thanks Mich

Re: Spark on Kubernetes scheduler variety

2021-06-25 Thread Yikun Jiang
Oops, sorry for the wrong link; it should be: We will also prepare to propose an initial design and POC[3] on a shared branch (based on the Spark master branch) where we can collaborate on it, so I created the spark-volcano[1] org on GitHub to make it happen. [3] https://github.com/huawei-cloudnative

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread John Zhuge
Thanks Yikun! On Thu, Jun 24, 2021 at 8:54 PM Yikun Jiang wrote: > Hi, folks. > > As @Klaus mentioned, We have some work on Spark on k8s with volcano native > support. Also, there were also some production deployment validation from > our partners in China, like JingDong, XiaoHongShu, VIPshop. >

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread Yikun Jiang
Hi, folks. As @Klaus mentioned, we have some work on Spark on k8s with Volcano native support. There has also been some production deployment validation from our partners in China, like JingDong, XiaoHongShu, VIPshop. We will also prepare to propose an initial design and POC[3] on a shared bran

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread Mich Talebzadeh
Hi Holden, Thank you for your points. I guess, coming from the corporate world, I overlooked how an open source project like Spark leverages resources and interest :). As @KlausMa kindly volunteered it would be good to hear scheduling ideas on Spark on Kubernetes and of course as I am su

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread Holden Karau
Hi Mich, I certainly think making Spark on Kubernetes run well is going to be a challenge. However I think, and I could be wrong about this as well, that in terms of cluster managers Kubernetes is likely to be our future. Talking with people I don't hear about new standalone, YARN or mesos deploym

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread Holden Karau
That's awesome, I'm just starting to get context around Volcano but maybe we can schedule an initial meeting for all of us interested in pursuing this to get on the same page. On Wed, Jun 23, 2021 at 6:54 PM Klaus Ma wrote: > Hi team, > > I'm kube-batch/Volcano founder, and I'm excited to hear t

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread John Zhuge
Thanks Klaus! I am interested in more details. On Wed, Jun 23, 2021 at 6:54 PM Klaus Ma wrote: > Hi team, > > I'm kube-batch/Volcano founder, and I'm excited to hear that the spark > community also has such requirements :) > > Volcano provides several features for batch workload, e.g. fair-share

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread Mich Talebzadeh
Thanks Klaus. That will be great. It will also be helpful if you elaborate on the need for this feature in line with the limitations of the current batch workload. Regards, Mich

Re: Spark on Kubernetes scheduler variety

2021-06-23 Thread Klaus Ma
Hi team, I'm the kube-batch/Volcano founder, and I'm excited to hear that the Spark community also has such requirements :) Volcano provides several features for batch workloads, e.g. fair-share, queue, reservation, preemption/reclaim and so on. It has been used in several product environments with Sp

Re: Spark on Kubernetes scheduler variety

2021-06-23 Thread Mich Talebzadeh
Please allow me to be diverse and express a different point of view on this roadmap. I believe from a technical point of view spending time and effort plus talent on batch scheduling on Kubernetes could be rewarding. However, if I may say I doubt whether such an approach and the so-called democra

Re: Spark on Kubernetes scheduler variety

2021-06-18 Thread Holden Karau
I think these approaches are good, but there are limitations (eg dynamic scaling) without us making changes inside of the Spark Kube scheduler. Certainly whichever scheduler extensions we add support for we should collaborate with the people developing those extensions insofar as they are interest

Re: Spark on Kubernetes scheduler variety

2021-06-18 Thread Mich Talebzadeh
Hi, Regarding your point and I quote ".. I know that one of the Spark on Kube operators supports volcano/kube-batch so I was thinking that might be a place I would start exploring..." There seems to be ongoing work on say Volcano as part of Cloud Native Computing Foundation

Spark on Kubernetes scheduler variety

2021-06-17 Thread Holden Karau
Hi Folks, I'm continuing my adventures to make Spark on containers party and I was wondering if folks have experience with the different batch scheduler options that they prefer? I was thinking so that we can better support dynamic allocation it might make sense for us to support using different s