Hi All,

First, thanks to Holden for organising this open discussion and exchange of
ideas. I must apologize for the problems with my microphone. Hopefully it
will not happen again.

From my own commercial experience with k8s, mainly Google GKE, the main
concern is that Spark on GKE is a work in progress and not on par with
Spark on Hadoop/YARN, for example Spark on Google Dataproc. I don't think
this statement is true any longer, as Spark on K8s has since matured,
albeit the performance is not 100% there. The commercial motivation for
Spark on K8s is reduction in cost. The assumption is that it would be
cheaper to run Spark on GKE without Dataproc. Not to forget that there are
other non-Spark applications and datastores running on K8s/GKE, so it makes
sense to improve Spark performance on K8s. Another motivation is to break
down monolithic applications into microservices, and from the point of view
of ETL/ELT Spark plays a considerable role.

For those who are still using Spark on Hadoop/YARN (which, if I recall,
Colin mentioned), Google has thought that Spark is important enough to
offer a migration path from Spark on Dataproc to running a Spark job on
Dataproc on Google Kubernetes Engine. I am not sure whether other cloud
vendors have been through this journey. Maybe some members will clarify
this.
<https://cloud.google.com/dataproc/docs/guides/dpgke/quickstarts/dataproc-gke-quickstart-create-cluster>

With regard to authentication, there is Workload Identity, which has
replaced the clumsy secrets file that compromised security and was
available to all nodes of the K8s cluster. I am not sure how Spark can
integrate with Workload Identity. The authentication is at the pod level
rather than the node level.
<https://cloud.google.com/dataproc/docs/guides/dpgke/quickstarts/dataproc-gke-quickstart-create-cluster>
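As a rough illustration of what pod-level authentication could look like
from the Spark side, here is a minimal sketch, not a tested recipe. It
assumes a Kubernetes service account already exists and has been bound to a
Google service account through Workload Identity; the namespace "spark" and
the account name "spark-sa" are hypothetical. The two serviceAccountName
properties are Spark's own Kubernetes settings (the executor one is only
present in more recent Spark releases). The helper just builds the --conf
flags for spark-submit so that driver and executor pods run under that
account.

```python
def workload_identity_confs(namespace: str, ksa: str) -> list[str]:
    """Build spark-submit --conf flags that run the driver and executor
    pods under one Kubernetes service account (KSA).

    Assumption: the KSA is already linked to a Google service account via
    Workload Identity, so pods using it get credentials at the pod level.
    """
    confs = {
        "spark.kubernetes.namespace": namespace,
        "spark.kubernetes.authenticate.driver.serviceAccountName": ksa,
        "spark.kubernetes.authenticate.executor.serviceAccountName": ksa,
    }
    flags: list[str] = []
    for key, value in confs.items():
        flags += ["--conf", f"{key}={value}"]
    return flags


if __name__ == "__main__":
    # Hypothetical namespace and service account names.
    print(" ".join(workload_identity_confs("spark", "spark-sa")))
```

The output can be appended to an existing spark-submit command line. Whether
the storage connectors then pick up the pod's identity without a key file is
something I would verify on a test cluster first.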


Thanks


   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Mon, 13 Feb 2023 at 08:24, Holden Karau <hol...@pigscanfly.ca> wrote:

> Some general issues we found common ground around:
>
> Inter-Pod security, istio + mTLS
> Sidecar management
> Docker Images
> Add links to more related images
> - Helm links
> Data Locality concerns
> Upgrading  Spark Versions
> Performance issues
>
> Thanks to everyone who was able to make the informal coffee chat
>
> I'll try and schedule another one at a more European friendly time so that
> we can all get to chat as well.
>
> On Fri, Feb 10, 2023 at 1:08 PM Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
>> Great looking forward to it
>>
>> Mich
>>
>> On Fri, 10 Feb 2023 at 18:58, Holden Karau <hol...@pigscanfly.ca> wrote:
>>
>>> Ok so the first iteration of this is booked:
>>>
>>>
>>> Spark on Kube Coffee Chats
>>> Sunday, Feb 12 · 6–7 PM pacific time
>>> Google Meet joining info
>>> Video call link: https://meet.google.com/wge-tzzd-uyj
>>>
>>> Assuming that all goes well I’ll send out another Doodle poll after this
>>> one for the folks who could not make this one.
>>>
>>> Looking forward to catching up with y’all :) No prep work necessary but
>>> if anyone wants to write down a brief, two-sentence blurb about their
>>> goals for Spark on Kube, I was thinking we might go around the virtual
>>> room sharing that as our kicking off point for this coffee meeting :)
>>>
>>>
>>> On Wed, Feb 8, 2023 at 12:27 PM Mich Talebzadeh <
>>> mich.talebza...@gmail.com> wrote:
>>>
>>>> That sounds like a good plan Holden!
>>>>
>>>>
>>>> Let us go for it
>>>>
>>>> On Wed, 8 Feb 2023 at 20:12, Holden Karau <hol...@pigscanfly.ca> wrote:
>>>>
>>>>> My thought here was that it's more focused on getting to understand
>>>>> each other's goals / priorities and less solving any specific problem.
>>>>>
>>>>> For example, I know that some folks running on EKS have different
>>>>> priorities than folks running on-prem.
>>>>>
>>>>> We might (later on) make a roadmap doc if that seems necessary, but
>>>>> I'm hoping that just an understanding of folks priorities and challenges
>>>>> will make it easier for us to all collaborate.
>>>>>
>>>>> On Wed, Feb 8, 2023 at 11:47 AM Mich Talebzadeh <
>>>>> mich.talebza...@gmail.com> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> Is this going to be a brainstorming meeting, or will there be a prior
>>>>>> agenda to work from?
>>>>>>
>>>>>> thanks
>>>>>>
>>>>>> On Wed, 8 Feb 2023 at 18:33, Mich Talebzadeh <
>>>>>> mich.talebza...@gmail.com> wrote:
>>>>>>
>>>>>>> Ok Colin thanks for clarification
>>>>>>>
>>>>>>> On Wed, 8 Feb 2023 at 18:08, Colin Williams <
>>>>>>> colin.williams.seat...@gmail.com> wrote:
>>>>>>>
>>>>>>>> I'm sorry you misunderstood.  The context is migrating jobs to
>>>>>>>> Spark on k8s.
>>>>>>>>
>>>>>>>> On Wed, Feb 8, 2023, 8:31 AM Mich Talebzadeh <
>>>>>>>> mich.talebza...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Colin,
>>>>>>>>>
>>>>>>>>> Thanks for your reply.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I think both YARN and Kubernetes are cluster managers, plus
>>>>>>>>> Standalone and Remotely Mesos. So I gather this discussion will
>>>>>>>>> focus on Spark on k8s unless I am mistaken.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> HTH,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Mich
>>>>>>>>>
>>>>>>>>> On Wed, 8 Feb 2023 at 16:02, Colin Williams <
>>>>>>>>> colin.williams.seat...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Cloud or non cloud. Yarn, etc.
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 8, 2023, 7:57 AM Mich Talebzadeh <
>>>>>>>>>> mich.talebza...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Colin,
>>>>>>>>>>>
>>>>>>>>>>> Environments meaning different (cloud)  vendors or hosts?
>>>>>>>>>>>
>>>>>>>>>>> On Wed, 8 Feb 2023 at 03:42, Colin Williams <
>>>>>>>>>>> colin.williams.seat...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I wouldn't mind attending or viewing a recording depending on
>>>>>>>>>>>> availability. I'm interested in challenges and solutions to
>>>>>>>>>>>> porting Spark jobs between environments.
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Feb 7, 2023 at 7:34 PM Denis Bolshakov <
>>>>>>>>>>>> bolshakov.de...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am also interested, please add me to the conf.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, 8 Feb 2023 at 07:21, Jayabindu Singh <
>>>>>>>>>>>>> jayabi...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Greetings everyone!
>>>>>>>>>>>>>> I am super new to this group and currently leading some work
>>>>>>>>>>>>>> to deploy spark on k8 for my company o9 Solutions.
>>>>>>>>>>>>>> I would love to join the discussion.
>>>>>>>>>>>>>> I am in PST.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>> Jay
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sent from my iPhone
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Feb 7, 2023, at 3:57 PM, Mich Talebzadeh <
>>>>>>>>>>>>>> mich.talebza...@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Could be interesting. Need to summarise where we are with
>>>>>>>>>>>>>> Spark on k8s and what the market demands.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> My personal experience with Volcano was not that
>>>>>>>>>>>>>> impressive 🤔. So maybe a summary of where we are currently
>>>>>>>>>>>>>> with Spark on k8s will do.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am on Greenwich Mean Time but I can take part in late
>>>>>>>>>>>>>> sessions if needed.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> HTH
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, 7 Feb 2023 at 23:37, John Zhuge <jzh...@apache.org>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Awesome, count me in!
>>>>>>>>>>>>>>> PST
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Feb 7, 2023 at 3:34 PM Andrew Melo <
>>>>>>>>>>>>>>> andrew.m...@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm Central US time (AKA UTC -6:00)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Feb 7, 2023 at 5:32 PM Holden Karau <
>>>>>>>>>>>>>>>> hol...@pigscanfly.ca> wrote:
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > Awesome, I guess I should have asked folks for timezones
>>>>>>>>>>>>>>>> > that they’re in.
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > On Tue, Feb 7, 2023 at 3:30 PM Andrew Melo <
>>>>>>>>>>>>>>>> andrew.m...@gmail.com> wrote:
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> Hello Holden,
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> We are interested in Spark on k8s and would like the
>>>>>>>>>>>>>>>> >> opportunity to speak with devs about what we're looking for
>>>>>>>>>>>>>>>> >> slash better ways to use Spark.
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> Thanks!
>>>>>>>>>>>>>>>> >> Andrew
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> On Tue, Feb 7, 2023 at 5:24 PM Holden Karau <
>>>>>>>>>>>>>>>> hol...@pigscanfly.ca> wrote:
>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>> >> > Hi Folks,
>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>> >> > It seems like we could maybe use some additional
>>>>>>>>>>>>>>>> >> > shared context around Spark on Kube so I’d like to try and
>>>>>>>>>>>>>>>> >> > schedule a virtual coffee session.
>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>> >> > Who all would be interested in virtual adventures
>>>>>>>>>>>>>>>> >> > around Spark on Kube development?
>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>> >> > No pressure if the idea of hanging out in a virtual
>>>>>>>>>>>>>>>> >> > chat with coffee and Spark devs does not sound like your
>>>>>>>>>>>>>>>> >> > thing, just trying to make something informal so we can
>>>>>>>>>>>>>>>> >> > have a better understanding of everyone’s goals here.
>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>> >> > Cheers,
>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>> >> > Holden :)
>>>>>>>>>>>>>>>> >> > --
>>>>>>>>>>>>>>>> >> > Twitter: https://twitter.com/holdenkarau
>>>>>>>>>>>>>>>> >> > Books (Learning Spark, High Performance Spark, etc.):
>>>>>>>>>>>>>>>> https://amzn.to/2MaRAG9
>>>>>>>>>>>>>>>> >> > YouTube Live Streams:
>>>>>>>>>>>>>>>> https://www.youtube.com/user/holdenkarau
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>>>>>>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> John Zhuge
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>
>>
>
