Resending with dev@ cc'd for the record - sorry, I forgot to reply-all earlier :)

For 1 - I'm leaning more towards 'official', as this aims to give Spark
users a community-recommended way to automate and manage Spark deployments
on k8s. From my point of view, that does not make the current or other
options non-standard.

For 2/3 - as the operator starts driver pods in the same way as
spark-submit, I would not expect start-up time to be significantly reduced
by using the operator. However, there are some optimizations we can make in
practice. For example, with the operator we can let users separate
application packaging from Spark: use an init container to load the Spark
binaries, and apply the application jar / packages on top of that in a
different container. The benefit is that the application image or package
stays relatively lean and therefore takes less time to upload to a registry
or to download onto nodes. Spark images could be relatively static (e.g.
the official docker images <https://github.com/apache/spark-docker>) and
hence can be cached on nodes. More technical details can be discussed in
the upcoming design doc if we agree to proceed with the operator proposal.
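
To make the init-container idea a bit more concrete, here is a rough Python
sketch that only builds a pod manifest as a plain dict. It is not how the
operator is implemented. The function name build_driver_pod, the image tag
apache/spark:3.5.0, the /opt/spark path, the volume name and the example
registry are all assumptions for illustration, and the lean application
image is assumed to ship its own JRE.

```python
import json


def build_driver_pod(app_image: str, spark_image: str = "apache/spark:3.5.0") -> dict:
    """Build an illustrative pod manifest (hypothetical layout, not the operator's).

    An init container started from a static, node-cacheable Spark image copies
    the Spark distribution into a shared emptyDir volume. The main container
    runs the lean application image and mounts that volume as its SPARK_HOME.
    """
    volume_name = "spark-dist"  # assumed name for the shared volume
    spark_home = "/opt/spark"   # assumed Spark location inside the Spark image
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "spark-driver-sketch"},
        "spec": {
            "volumes": [{"name": volume_name, "emptyDir": {}}],
            "initContainers": [
                {
                    "name": "load-spark",
                    "image": spark_image,
                    # Copy the Spark distribution into the shared volume.
                    "command": ["sh", "-c", f"cp -r {spark_home}/. /shared/"],
                    "volumeMounts": [{"name": volume_name, "mountPath": "/shared"}],
                }
            ],
            "containers": [
                {
                    "name": "driver",
                    # Lean image holding only the application jar / packages.
                    "image": app_image,
                    "env": [{"name": "SPARK_HOME", "value": spark_home}],
                    "volumeMounts": [{"name": volume_name, "mountPath": spark_home}],
                }
            ],
        },
    }


if __name__ == "__main__":
    # registry.example.com is a placeholder registry, not a real one.
    print(json.dumps(build_driver_pod("registry.example.com/my-app:1.0"), indent=2))
```

With this layout only the small application image changes between releases,
while the Spark image stays the same and can remain cached on the nodes.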

On Fri, Nov 10, 2023 at 8:11 AM Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> Hi,
>
> Looks like a good idea, but before committing myself, I have a number of
> design questions, having looked at the SPIP itself:
>
>
>    1. Would the name "Standard add-on Kubernetes operator to Spark"
>    describe it better?
>    2. We are still struggling with improving Spark driver start-up time.
>    What would be the footprint of this add-on on the driver start-up time?
>    3. In a commercial setting, will there be a static image for this,
>    besides the base image that is maintained in the so-called container
>    registry (ECR, GCR, etc.)? It takes time to upload these images. Will
>    this be a static image (Dockerfile)? The alternative would be for the
>    user to create this Dockerfile through a set of scripts.
>
>
> These are the things that come to mind.
>
> HTH
>
>
> Mich Talebzadeh,
> Distinguished Technologist, Solutions Architect & Engineer
> London
> United Kingdom
>
> On Fri, 10 Nov 2023 at 14:19, Bjørn Jørgensen <bjornjorgen...@gmail.com>
> wrote:
>
>> +1
>>
>> On Fri, 10 Nov 2023 at 08:39, Nan Zhu <zhunanmcg...@gmail.com> wrote:
>>
>>> Just curious: what happened to Google's Spark operator?
>>>
>>> On Thu, Nov 9, 2023 at 19:12 Ilan Filonenko <i...@cornell.edu> wrote:
>>>
>>>> +1
>>>>
>>>> On Thu, Nov 9, 2023 at 7:43 PM Ryan Blue <b...@tabular.io> wrote:
>>>>
>>>>> +1
>>>>>
>>>>> On Thu, Nov 9, 2023 at 4:23 PM Hussein Awala <huss...@awala.fr> wrote:
>>>>>
>>>>>> +1 for creating an official Kubernetes operator for Apache Spark
>>>>>>
>>>>>> On Fri, Nov 10, 2023 at 12:38 AM huaxin gao <huaxin.ga...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> +1
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Nov 9, 2023 at 3:14 PM DB Tsai <dbt...@dbtsai.com> wrote:
>>>>>>>
>>>>>>>> +1
>>>>>>>>
>>>>>>>> To be completely transparent, I am employed in the same department
>>>>>>>> as Zhou at Apple.
>>>>>>>>
>>>>>>>> I support this proposal, given the community adoption we
>>>>>>>> witnessed following the release of the Flink Kubernetes operator,
>>>>>>>> which streamlined Flink deployment on Kubernetes.
>>>>>>>>
>>>>>>>> A well-maintained official Spark Kubernetes operator is essential
>>>>>>>> for our Spark community as well.
>>>>>>>>
>>>>>>>> DB Tsai  |  https://www.dbtsai.com/  |  PGP 42E5B25A8F7A82C1
>>>>>>>>
>>>>>>>> On Nov 9, 2023, at 12:05 PM, Zhou Jiang <zhou.c.ji...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hi Spark community,
>>>>>>>>
>>>>>>>> I'm reaching out to initiate a conversation about the possibility
>>>>>>>> of developing a Java-based Kubernetes operator for Apache Spark.
>>>>>>>> Following the operator pattern (
>>>>>>>> https://kubernetes.io/docs/concepts/extend-kubernetes/operator/),
>>>>>>>> Spark users may manage applications and related components
>>>>>>>> seamlessly using native tools like kubectl. The primary goal is to
>>>>>>>> simplify the Spark user experience on Kubernetes, minimizing the
>>>>>>>> learning curve and operational complexities and thereby enabling
>>>>>>>> users to focus on Spark application development.
>>>>>>>>
>>>>>>>> Although there are several open-source Spark on Kubernetes
>>>>>>>> operators available, none of them are officially integrated into
>>>>>>>> the Apache Spark project. As a result, these operators may lack
>>>>>>>> active support and development for new features. Within this
>>>>>>>> proposal, our aim is to introduce a Java-based Spark operator as
>>>>>>>> an integral component of the Apache Spark project. This solution
>>>>>>>> has been employed internally at Apple for multiple years,
>>>>>>>> operating millions of executors in real production environments.
>>>>>>>> The use of Java in this solution is intended to accommodate a
>>>>>>>> wider user and contributor audience, especially those who are
>>>>>>>> familiar with Scala.
>>>>>>>>
>>>>>>>> Ideally, this operator should have its dedicated repository,
>>>>>>>> similar to Spark Connect Golang or Spark Docker, allowing it to
>>>>>>>> maintain a loose connection with the Spark release cycle. This
>>>>>>>> model is also followed by the Apache Flink Kubernetes operator.
>>>>>>>>
>>>>>>>> We believe that this project holds the potential to evolve into a
>>>>>>>> thriving community project over the long run. A comparison can be
>>>>>>>> drawn with the Flink Kubernetes Operator: Apple has open-sourced
>>>>>>>> its internal Flink Kubernetes operator, making it a part of the
>>>>>>>> Apache Flink project (
>>>>>>>> https://github.com/apache/flink-kubernetes-operator).
>>>>>>>> This move has gained wide industry adoption and contributions from
>>>>>>>> the community. In a mere year, the Flink operator has garnered
>>>>>>>> more than 600 stars and has attracted contributions from over 80
>>>>>>>> contributors. This showcases the level of community interest and
>>>>>>>> collaborative momentum that can be achieved in similar scenarios.
>>>>>>>>
>>>>>>>> More details can be found in the SPIP doc, Spark Kubernetes
>>>>>>>> Operator:
>>>>>>>> https://docs.google.com/document/d/1f5mm9VpSKeWC72Y9IiKN2jbBn32rHxjWKUfLRaGEcLE
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> --
>>>>>>>> *Zhou JIANG*
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>
>>>>> --
>>>>> Ryan Blue
>>>>> Tabular
>>>>>
>>>>
>>
>> --
>> Bjørn Jørgensen
>> Vestre Aspehaug 4, 6010 Ålesund
>> Norge
>>
>> +47 480 94 297
>>
>

-- 
*Zhou JIANG*
