+1 from another user fwiw. We also have livy containers and helm charts. The 
real problem is deploying a Spark Cluster in k8s. We know of no working images 
for this. The Spark team seems focused on deploying Jobs with k8s, which is 
fine but is not enough. We need to deploy Spark itself.  We created our own 
containers and charts for this too.

Is anyone interested in sharing images that work with k8s for Livy and/or 
Spark? Ours are all ASF licensed OSS.

From: Marco Gaido <marcogaid...@gmail.com>
Reply: dev@livy.incubator.apache.org <dev@livy.incubator.apache.org>
Date: January 14, 2020 at 2:35:34 PM
To: dev@livy.incubator.apache.org <dev@livy.incubator.apache.org>
Subject:  Re: Livy on Kubernetes support  

Hi Aliaksandr,  

thanks for your email and you work on this feature. As I mentioned to you  
in the PR, I agree with you on the usefulness of this feature and you have  
a big +1 from me having it in Livy. Unfortunately, it is not my area of  
expertise, so I don't feel confident merging it without other reviewers  
taking a careful look at it.  

For the future, I think a better approach would be first to discuss and  
define the architecture with the community, so that it is shared and  
accepted by the whole community before the PR is out. This helps also  
getting people involved and makes easier having them being able to review  
the PR. Anyway, after you have split the PRs, I think they are reasonable  
and we can discuss on them.  

Looking forward to have your contribution in Livy.  

Thanks,  
Marco  

Il giorno mar 14 gen 2020 alle ore 12:48 Aliaksandr Sasnouskikh <  
jahstreetl...@gmail.com> ha scritto:  

> Hi community,  
>  
> About a year ago I've started to work on the patch to Apache Livy for Spark  
> on Kubernetes support in the scope of the project I've been working on.  
> Since that time I've created a PR  
> https://github.com/apache/incubator-livy/pull/167 which have already been  
> discussed and reviewed a lot. After finalizing the work in the result of  
> the PR discussions I've started to split the work introduced in the base PR  
> into smaller pieces to make it easier to separate the core and aux  
> functionality, and as a result - easier to review and merge. The first core  
> PR is https://github.com/apache/incubator-livy/pull/249.  
>  
> Also I've created the repos with Docker images (  
> https://github.com/jahstreet/spark-on-kubernetes-docker) and Helm charts (  
> https://github.com/jahstreet/spark-on-kubernetes-helm) with the possible  
> stack the users may want to use Livy on Kubernetes with, which potentially  
> in the future can be partially moved to Livy repo to keep the artifacts  
> required to run Livy on Kubernetes in a single place.  
>  
> Until now I've received the positive feedback from more than 10 projects  
> about the usage of the patch. Several of them could be found in the  
> discussions of the base PR. Also my repos supporting this feature have  
> around 35 stars and 15 forks in total and were referenced in Spark related  
> Stackoverflow and Kubernetes slack channel discussions. So the users use it  
> already.  
>  
> You may think "What this guy wants from us then!?"... Well, I would like to  
> ask for your time and expertise to help push it forward and ideally make it  
> merged.  
>  
> Probably before I started coding I should have checked with the  
> contributors if this feature may have value for the project and how is  
> better to implement it, but I hope it is never too late;) So I'm here to  
> share with you the the thoughts behind it.  
>  
> The idea of Livy on Kubernetes is simply to replicate the logic it has for  
> Yarn API to Kubernetes API, which can be easily done since the interfaces  
> for the Yarn API are really similar to the ones of the Kubernetes.  
> Nevertheless this easy-to-do patch opens Livy the doors to Kubernetes which  
> seems to be really useful for the community taking into account the  
> popularity of Kubernetes itself and the latest releases of Spark supporting  
> Kubernetes as well.  
>  
> Proposed Livy job submission flow:  
> - Generate appTag and add  
> `spark.kubernetes.[driver/executor].label.SPARK_APP_TAG_LABEL` to Spark  
> config  
> - Run spark-submit in cluster-mode with Kubernetes master  
> - Start monitoring thread which resolves Spark Driver and Executor Pods  
> using the `SPARK_APP_TAG_LABEL`s assigned during the job submission  
> - Create additional Kubernetes resource if necessary: Spark UI service,  
> Ingress, CRDs, etc.  
> - Fetch Spark Pods statuses, Driver logs and other diagnostics information  
> while Spark Pods are running  
> - Remove Spark job resources (completed/failed Driver Pod, Service,  
> ConfigMap, etc.) from the cluster after the job completion/failure after  
> the configured timeout  
>  
> The core functionality (covered by  
> https://github.com/apache/incubator-livy/pull/249):  
> - Submission of Batch jobs and Interactive sessions  
> - Caching Driver logs and Kubernetes Pods diagnostics  
>  
> Aux features (introduced in  
> https://github.com/apache/incubator-livy/pull/167 and planned to be pushed  
> after the core PR is merged):  
> - Proxying Spark UI with Kubernetes Ingress'es and providing the access to  
> it with the link on the Livy UI while the job is running (Nginx ingress as  
> an example)  
> - Proxying Spark History Server with Kubernetes Ingress'es and providing  
> the access to it with the link on the Livy UI after the job  
> completion/failure (Nginx ingress as an example)  
>  
> Misc:  
> - Add documentation to the Livy web-site and the guide on how to run Livy  
> on Kubernetes with the possible config options  
> - Add Docker manifests and Helm charts  
>  
> Having Livy running on Kubernetes we do get not only the possibility to run  
> Spark on Kubernetes with the REST API but also get out of the box  
> integrations:  
> - Jupyter notebooks (with JupyterHub on Kubernetes) through Sparkmagic  
> kernel  
> - Extensive monitoring with Prometheus and Grafana stack  
>  
> Some more ideas could be found in the readme here:  
> https://github.com/jahstreet/spark-on-kubernetes-helm.  
>  
> I would be thankful to everyone paying some time to review the feature I  
> propose and appreciate any help or advices on the right steps towards  
> making the feature released.  
>  
> Kind regards,  
> Aliaksandr Sasnouskikh  
> https://github.com/jahstreet/  
> https://www.linkedin.com/in/jahstreet/  
> +31 (0) 6 51 29 58 39  
>  

Reply via email to