Fwd: Time to start publishing Spark Docker Images?

2023-05-09 Thread Mich Talebzadeh
warded Conversation Subject: Time to start publishing Spark Docker Images? From: Holden Karau Date: Thu, 22 Jul 2021 at 04:13 To: dev Hi Folks, Many other distributed computing (https://hub.docker.com/r/rayproject/ray https://hub.docker.com/u/daskdev) and ASF projects (

Re: Time to start publishing Spark Docker Images?

2022-02-21 Thread Mich Talebzadeh
forwarded view my Linkedin profile https://en.everybodywiki.com/Mich_Talebzadeh *Disclaimer:* Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise

Re: Time to start publishing Spark Docker Images?

2021-08-17 Thread Holden Karau
/opt/spark/work-dir# pip list >>>>>>>>>> Package Version >>>>>>>>>> ----- --- >>>>>>>>>> asn1crypto 0.24.0 >>>>>>>>>> cryptography 2.6.1 >>>>

Re: Time to start publishing Spark Docker Images?

2021-08-17 Thread Mich Talebzadeh
3.1.2 >>>>>>>>> pyxdg 0.25 >>>>>>>>> PyYAML 5.4.1 >>>>>>>>> SecretStorage 2.3.1 >>>>>>>>> setuptools 57.4.0 >>>>>>>>> six 1.12.0 >>>>>>>>

Re: Time to start publishing Spark Docker Images?

2021-08-17 Thread Andrew Melo
>>>>>>> 1. One vanilla flavour for everyday use with few useful packages >>>>>>>> 2. One for medium use with most common packages for ETL/ELT >>>>>>>> stuff >>>>>>>> 3. One specialist for ML etc with keras

Re: Time to start publishing Spark Docker Images?

2021-08-17 Thread Mich Talebzadeh

Re: Time to start publishing Spark Docker Images?

2021-08-17 Thread Andrew Melo
>>>>>>> certainly >>>>>>> see a solid reason to publish like with a jdk11 & jdk8 suffix as well if >>>>>>> there is interest in the community. If we want to have a say >>>>>>> spark-py-pandas for a Spark container i
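
The suffix idea quoted above could be sketched as a small tag generator. This is only an illustration, not a naming scheme agreed in the thread: the repository name, the version string, and the function `image_tags` are hypothetical placeholders; only the `jdk8` and `jdk11` suffixes come from the message.

```python
from itertools import product


def image_tags(repo, spark_versions, jdk_suffixes):
    """Return one image tag per (Spark version, JDK suffix) pair.

    All arguments are illustrative; the thread does not fix a naming scheme.
    """
    return [
        f"{repo}:{version}-{jdk}"
        for version, jdk in product(spark_versions, jdk_suffixes)
    ]


if __name__ == "__main__":
    # Hypothetical repository name and version, used only for the demo.
    for tag in image_tags("example/spark-py", ["x.y.z"], ["jdk8", "jdk11"]):
        print(tag)
```

Generating the tags from a version list and a suffix list keeps the matrix explicit, which may help when deciding how many variants are worth publishing.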

Re: Time to start publishing Spark Docker Images?

2021-08-17 Thread Mich Talebzadeh
>>>> directory for Python under Kubernetes >>>>>> >>>>>> /opt/spark/kubernetes/dockerfiles/spark/bindings/python >>>>>> >>>>>> RUN pip install pyyaml numpy cx_Oracle tensorflow >>>>>> >>>>
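
A minimal sketch of the approach this message describes, baking extra Python libraries into the image at build time. The helper `render_dockerfile` and the base image name are hypothetical; the package list is the one from the quoted `RUN pip install` line.

```python
def render_dockerfile(base_image, pip_packages):
    """Render Dockerfile text that layers pip packages on top of a base image.

    `base_image` is a placeholder; the thread does not name a published image.
    """
    lines = [f"FROM {base_image}"]
    if pip_packages:
        # A single RUN instruction keeps all packages in one image layer.
        lines.append("RUN pip install " + " ".join(pip_packages))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Package names taken from the quoted message; the image name is made up.
    print(render_dockerfile(
        "example/spark-py:base",
        ["pyyaml", "numpy", "cx_Oracle", "tensorflow"],
    ))
```

The rendered text could be written to a file and passed to `docker build`; a real image may also need to switch users around the install step, which this sketch leaves out.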

Re: Time to start publishing Spark Docker Images?

2021-08-17 Thread Mich Talebzadeh
>> >>>>> As I said I am happy to build these specific dockerfiles plus the >>>>> complete documentation for it. I have already built one for Google (GCP). >>>>> The difference between Spark and PySpark version is that in Spark/scala a >>>

Re: Time to start publishing Spark Docker Images?

2021-08-17 Thread Maciej

Re: Time to start publishing Spark Docker Images?

2021-08-17 Thread Mich Talebzadeh

Re: Time to start publishing Spark Docker Images?

2021-08-17 Thread Mich Talebzadeh
>>>>>> For example how to add additional Python libraries like tensorflow >>>>>> etc. Loading these libraries through Kubernetes is not practical as >>>>>> unzipping and installing it through --py-files etc will take considerable >>>>

Re: Time to start publishing Spark Docker Images?

2021-08-17 Thread Holden Karau
est the ports from inside the docker >>>>> >>>>> RUN apt-get update && apt-get install -y curl >>>>> RUN ["apt-get","install","-y","vim"] >>>>> >>>>> As I said I am happy to build

Re: Time to start publishing Spark Docker Images?

2021-08-17 Thread Mich Talebzadeh

Re: Time to start publishing Spark Docker Images?

2021-08-16 Thread Holden Karau

Re: Time to start publishing Spark Docker Images?

2021-08-16 Thread Maciej
erarchy of different > images like this: > >   > > > https://jupyter-docker-stacks.readthedocs.io/en/latest/using/selecting.html#image-relationships > > <https://jupyter-docker-stack

Re: Time to start publishing Spark Docker Images?

2021-08-13 Thread Mich Talebzadeh
>>>> >>>> >>>> I am Meikel Bode and only an interested reader of dev and user list. >>>> Anyway, I would appreciate to have official docker images available. >>>> >>>> Maybe one could get inspiration from the Jupyter docker stacks a

Re: Time to start publishing Spark Docker Images?

2021-08-13 Thread Holden Karau
> >>> >>> I am Meikel Bode and only an interested reader of dev and user list. >>> Anyway, I would appreciate to have official docker images available. >>> >>> Maybe one could get inspiration from the Jupyter docker stacks and >>> provide an hie

Re: Time to start publishing Spark Docker Images?

2021-08-13 Thread Mich Talebzadeh
> provide an hierarchy of different images like this: >> >> >> >> >> https://jupyter-docker-stacks.readthedocs.io/en/latest/using/selecting.html#image-relationships >> >> >> >> Having a core image only supporting Java, an extended supporting Pyth

Re: Time to start publishing Spark Docker Images?

2021-08-13 Thread Mich Talebzadeh
ml#image-relationships > > > > Having a core image only supporting Java, an extended supporting Python > and/or R etc. > > > > Looking forward to the discussion. > > > > Best, > > Meikel > > > > *From:* Mich Talebzadeh > *Sent:* Friday, 13

RE: Time to start publishing Spark Docker Images?

2021-08-13 Thread Bode, Meikel, NMA-CFD
to start publishing Spark Docker Images? I concur this is a good idea and certainly worth exploring. In practice, preparing docker images as deployable will throw some challenges because creating docker for Spark is not really a singular modular unit, say creating docker for Jenkins. It involves

Re: Time to start publishing Spark Docker Images?

2021-08-13 Thread Mich Talebzadeh
I concur this is a good idea and certainly worth exploring. In practice, preparing docker images as deployable will throw some challenges because creating docker for Spark is not really a singular modular unit, say creating docker for Jenkins. It involves different versions and different images

Re: Time to start publishing Spark Docker Images?

2021-08-12 Thread Holden Karau
Awesome, I've filed an INFRA ticket to get the ball rolling. On Thu, Aug 12, 2021 at 5:48 PM John Zhuge wrote: > +1 > > On Thu, Aug 12, 2021 at 5:44 PM Hyukjin Kwon wrote: > >> +1, I think we generally agreed upon having it. Thanks Holden for headsup >> and driving this. >> >> +@Dongjoon Hyun

Re: Time to start publishing Spark Docker Images?

2021-08-12 Thread John Zhuge
+1 On Thu, Aug 12, 2021 at 5:44 PM Hyukjin Kwon wrote: > +1, I think we generally agreed upon having it. Thanks Holden for headsup > and driving this. > > +@Dongjoon Hyun FYI > > On Thu, Jul 22, 2021 at 12:22 PM, Kent Yao wrote: > >> +1 >> >> Bests, >> >> *Kent Yao * >> @ Data Science Center,

Re: Time to start publishing Spark Docker Images?

2021-08-12 Thread Hyukjin Kwon
+1, I think we generally agreed upon having it. Thanks Holden for headsup and driving this. +@Dongjoon Hyun FYI On Thu, Jul 22, 2021 at 12:22 PM, Kent Yao wrote: > +1 > > Bests, > > *Kent Yao * > @ Data Science Center, Hangzhou Research Institute, NetEase Corp. > *a spark enthusiast* > *kyuubi

Time to start publishing Spark Docker Images?

2021-07-21 Thread Holden Karau
Hi Folks, Many other distributed computing (https://hub.docker.com/r/rayproject/ray https://hub.docker.com/u/daskdev) and ASF projects ( https://hub.docker.com/u/apache) now publish their images to dockerhub. We've already got the docker image tooling in place, I think we'd need to ask the ASF