Re: Topics for Spark online classes & webinars

2023-03-28 Thread Mich Talebzadeh
- Spark internals and/or comparing Spark 3 and 2
- Spark Streaming & Spark Structured Streaming
- Spark on notebooks
- Spark on serverless (for example Spark on Google Cloud)
- Spark on k8s

If you are willing to contribute to presentation materials, please register your interest in slack/webinar

Re: Invitation to Join the Fintech Track at Community Over Code 2023 - Share Your Apache Spark Expertise

2023-03-24 Thread Mich Talebzadeh
ort engine etc. In that case, for real-life examples one may consider leveraging the Spark user community u...@spark.apache.org as well. I trust this helps. Mich Talebzadeh, Lead Solutions Architect/Engineering Lead Palantir Technologies Limited view my Linkedin profile <https://www.linkedi

Re: Graceful stop for spark streaming.

2023-03-22 Thread Mich Talebzadeh
Read this: [SPARK-42485] SPIP: Shutting down spark structured streaming when the streaming process completed current process - ASF JIRA (apache.org) <https://issues.apache.org/jira/browse/SPARK-42485> HTH Mich Talebzadeh, Le

Topics for Spark online classes & webinars, next steps

2023-03-21 Thread Mich Talebzadeh
e or @Denny Lee an email stating which topic and at what level you would like to take part. We propose to do a peer review of the draft presentation so no worries. Looking forward to hearing from you. HTH Mich Talebzadeh, Lead Solutions Architect/Engineering Lead Palantir Technologies Lim

Re: Topics for Spark online classes & webinars

2023-03-15 Thread Mich Talebzadeh
Understood Nitin. It would be wrong to act against one's conviction. I am sure we can find a way around providing the contents. Regards, Mich Talebzadeh, Lead Solutions Architect/Engineering Lead Palantir Technologies Limited view my Linkedin profile <https://www.linkedin.com

Re: Topics for Spark online classes & webinars

2023-03-15 Thread Mich Talebzadeh
well. Best of luck. Mich Talebzadeh, Lead Solutions Architect/Engineering Lead, Palantir Technologies Limited view my Linkedin profile <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/> https://en.everybodywiki.com/Mich_Talebzadeh *Disclaimer:* Use it at your own risk. Any a

Re: Topics for Spark online classes & webinars

2023-03-15 Thread Mich Talebzadeh
and contributions are welcome. HTH

Re: Adding pause() method to pyspark.sql.streaming.StreamingQuery

2023-03-15 Thread Mich Talebzadeh
Hi Martin. Yes, that is the intent. There may be other ways, but I cannot think of any. HTH

Re: Adding pause() method to pyspark.sql.streaming.StreamingQuery

2023-03-14 Thread Mich Talebzadeh

Re: Topics for Spark online classes & webinars

2023-03-14 Thread Mich Talebzadeh
Hi Denny, That Apache Spark Linkedin page https://www.linkedin.com/company/apachespark/ looks fine. It also allows a wider audience to benefit from it. +1 for me

Re: Topics for Spark online classes & webinars

2023-03-13 Thread Mich Talebzadeh
Well that needs to be created first for this purpose. The appropriate name etc. to be decided. Maybe @Denny Lee can facilitate this as he offered his help. cheers

Re: Topics for Spark online classes & webinars

2023-03-13 Thread Mich Talebzadeh
to is welcome

Topics for Spark online classes & webinars

2023-03-13 Thread Mich Talebzadeh
Hi guys, To move forward I selected these topics from the thread "Online classes for spark topics". To take this further I propose a Confluence page to be set up. Opinions and how to is welcome. Cheers

Re: spark executor pod has same memory value for request and limit

2023-03-10 Thread Mich Talebzadeh
agreed. need to be enhanced! HTH

Re: spark executor pod has same memory value for request and limit

2023-03-10 Thread Mich Talebzadeh
I forgot to ask which k8s cluster you are using, assuming some cloud vendor

Re: spark executor pod has same memory value for request and limit

2023-03-10 Thread Mich Talebzadeh
What are those currently set to in spark-submit, and which Spark version on k8s? --conf spark.driver.memory=2000m \ --conf spark.executor.memory=2000m \ HTH
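For context on the question in this thread, here is a rough sketch of how the executor pod memory figure is usually derived from the spark-submit setting. It assumes the documented defaults (an overhead factor of 0.10 with a 384 MiB floor for JVM jobs) and that Spark applies the resulting sum as both the pod's memory request and its limit; the function name is hypothetical, and explicit overhead settings or PySpark memory are not modelled.

```python
def executor_pod_memory_mib(executor_memory_mib: int,
                            overhead_factor: float = 0.10,
                            min_overhead_mib: int = 384) -> int:
    """Approximate pod memory (MiB) for one executor.

    Assumption: overhead = max(executor memory * factor, 384 MiB), and the
    total is used for both the Kubernetes memory request and the limit.
    """
    overhead = max(int(executor_memory_mib * overhead_factor), min_overhead_mib)
    return executor_memory_mib + overhead


# spark.executor.memory=2000m, as in the snippet above
print(executor_pod_memory_mib(2000))
```

With the 2000m value quoted above, the overhead floor of 384 MiB dominates, so the pod would ask for more than the heap size alone.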

Adding pause() method to pyspark.sql.streaming.StreamingQuery

2023-03-09 Thread Mich Talebzadeh
in a number of occasions? Thanks

Re: [DISCUSS][SPIP] Subexpression elimination supporting more physical operators

2023-03-07 Thread Mich Talebzadeh
Hi Kun, Have you checked the doc procedure for SPIP here, just in case? Spark Project Improvement Proposals (SPIP) | Apache Spark <https://spark.apache.org/improvement-proposals.html> HTH

Re: SPIP architecture diagrams

2023-03-04 Thread Mich Talebzadeh
ication. I have tried to make it generic. However, trademarks are acknowledged. I have tried not to use color but I guess pointers are fair. Let me know your thoughts. Regards

Fwd: Auto-reply: Re: [DISCUSS] Show Python code examples first in Spark documentation

2023-02-26 Thread Mich Talebzadeh
Hi, Can someone disable the below login from spark forums please? Sounds like someone left this email and we are receiving a spam type message anytime we respond. thanks

Re: [DISCUSS] Show Python code examples first in Spark documentation

2023-02-26 Thread Mich Talebzadeh
if we are going to do it, we might as well do it all. it is more cost effective so to speak. HTH

Re: [DISCUSS] Show Python code examples first in Spark documentation

2023-02-26 Thread Mich Talebzadeh
with Spark becoming more relevant to ETL and, to a lesser extent, to DS, I see the above order is fair. HTH

Re: SPIP architecture diagrams

2023-02-24 Thread Mich Talebzadeh
considered? Why were they rejected? If no alternatives have been considered, the problem needs more thought.* HTH

Re: [DISCUSS] Show Python code examples first in Spark documentation

2023-02-23 Thread Mich Talebzadeh
If this is not just flip flopping the document pages and involves other changes, then a proper impact analysis needs to be done to assess the efforts involved. Personally I don't think it really matters. HTH

Re: [DISCUSS] Show Python code examples first in Spark documentation

2023-02-22 Thread Mich Talebzadeh

Re: Spark Union performance issue

2023-02-22 Thread Mich Talebzadeh
Hi, A few details will help:
1. Spark version
2. Spark SQL, Scala or PySpark
3. Size of the tables in the join
4. What does explain() of the joining operation show?
HTH

SPIP: Adding work load identity to Spark on Kubernetes documents (supersedes Secret Management)

2023-02-20 Thread Mich Talebzadeh
om" } Cloud service account keys do not expire and require manual rotation. Exporting service account keys has the potential to expand the scope of a security breach if it goes undetected. If an exported key is stolen, an attacker can use it to authenticate as that service account until noti

Re: SPIP: Shutting down spark structured streaming when the streaming process completed current process

2023-02-19 Thread Mich Talebzadeh
Hi Dongjoon, This was an oversight from my side. I confused your involvement with docker build stuff. HTH

SPIP: Shutting down spark structured streaming when the streaming process completed current process

2023-02-18 Thread Mich Talebzadeh
/list.html?dev@spark.apache.org Thanks.

Re: [VOTE][RESULT][SPIP] Lazy Materialization for Parquet Read Performance Improvement

2023-02-17 Thread Mich Talebzadeh
great

Re: [VOTE][SPIP] Lazy Materialization for Parquet Read Performance Improvement

2023-02-16 Thread Mich Talebzadeh
How many votes are needed for the approval state? Thanks

Re: Spark on Kube (virtua) coffee/tea/pop times

2023-02-13 Thread Mich Talebzadeh
quickstart-create-cluster> <https://cloud.google.com/dataproc/docs/guides/dpgke/quickstarts/dataproc-gke-quickstart-create-cluster> Thanks

Re: [VOTE][SPIP] Lazy Materialization for Parquet Read Performance Improvement

2023-02-13 Thread Mich Talebzadeh
+1 for me

Re: [DISCUSS] SPIP: Lazy Materialization for Parquet Read Performance Improvement

2023-02-13 Thread Mich Talebzadeh
Hi, I thought we already voted to go ahead with this proposal!

Re: Spark on Kube (virtua) coffee/tea/pop times

2023-02-10 Thread Mich Talebzadeh
Great, looking forward to it. Mich

Re: Auto-reply: Re: Spark on Kube (virtua) coffee/tea/pop times

2023-02-08 Thread Mich Talebzadeh
Hi, Sounds like we are repeatedly getting out-of-office messages in Chinese from this member. Is there any way an admin can disable this account to save us from being spammed? Thanks

Re: Spark on Kube (virtua) coffee/tea/pop times

2023-02-08 Thread Mich Talebzadeh
That sounds like a good plan Holden! Let us go for it

Re: Spark on Kube (virtua) coffee/tea/pop times

2023-02-08 Thread Mich Talebzadeh
Hi all, Is this going to be a brainstorming meeting, or will there be a prior agenda to work around? Thanks

Re: Spark on Kube (virtua) coffee/tea/pop times

2023-02-08 Thread Mich Talebzadeh
OK Colin, thanks for the clarification

Re: Spark on Kube (virtua) coffee/tea/pop times

2023-02-08 Thread Mich Talebzadeh
Hi Colin, Thanks for your reply. I think both YARN and Kubernetes are cluster managers, plus Standalone and, remotely, Mesos. So I gather this discussion will focus on Spark on k8s unless I am mistaken. HTH, Mich

Re: Spark on Kube (virtua) coffee/tea/pop times

2023-02-08 Thread Mich Talebzadeh
Hi Colin, Environments meaning different (cloud) vendors or hosts?

Re: Spark on Kube (virtua) coffee/tea/pop times

2023-02-07 Thread Mich Talebzadeh
if needed HTH

Fwd: Shutting down spark structured streaming when the streaming process completed current process

2023-02-07 Thread Mich Talebzadeh
Resending this feature request and proposing a possible solution. Can someone advise if I need to complete a Spark project improvement proposal <https://spark.apache.org/improvement-proposals.html>? Thanks

Re: [DISCUSS] SPIP: Lazy Materialization for Parquet Read Performance Improvement

2023-02-01 Thread Mich Talebzadeh
s possible. HTH

Re: [DISCUSS] SPIP: Lazy Materialization for Parquet Read Performance Improvement

2023-02-01 Thread Mich Talebzadeh
+1

Re: [DISCUSS] SPIP: Lazy Materialization for Parquet Read Performance Improvement

2023-01-31 Thread Mich Talebzadeh
plate.
> While we are open to revise our design doc, it seems more like you are
> proposing the community to change the instruction per se?
>
> Kazu
>
> On Jan 31, 2023, at 11:24 AM, Mich Talebzadeh wrote:
>
> Hi,
>
> Thanks for these proposals. good suggestio

Re: [DISCUSS] SPIP: Lazy Materialization for Parquet Read Performance Improvement

2023-01-31 Thread Mich Talebzadeh

Re: Re: spark+kafka+dynamic resource allocation

2023-01-30 Thread Mich Talebzadeh
Sure, I suggest that you add a note to that Jira and express your interest. HTH

Update to Spark Kubernetes docs for secrets

2023-01-23 Thread Mich Talebzadeh
ount.com", "client_id": "123", "auth_uri": "https://accounts.google.com/o/oauth2/auth", "token_uri": "https://oauth2.googleapis.com/token", "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2

Published docker images vulnerabilities scan

2023-01-23 Thread Mich Talebzadeh

Re: Jupyter notebook on Dataproc versus GKE

2022-09-06 Thread Mich Talebzadeh
book, terminal, text editor, file browser, rich outputs, etc.) in a
> flexible and powerful user interface.*"
> https://github.com/jupyterlab/jupyterlab
>
> You will find them both at https://jupyter.org
>
> On Mon, 5 Sep 2022 at 23:40, Mich Talebzadeh <
> mi

Re: Jupyter notebook on Dataproc versus GKE

2022-09-05 Thread Mich Talebzadeh
Thanks Bjorn, What are the differences, and what functionality does JupyterLab bring on top of Jupyter notebook?

Re: Jupyter notebook on Dataproc versus GKE

2022-09-05 Thread Mich Talebzadeh
BigQuery. What does the Jupyter notebook offer that others don't?

Jupyter notebook on Dataproc versus GKE

2022-09-05 Thread Mich Talebzadeh
with Volcano. Has progress been made on that front? Regards, Mich

Re: [DISCUSS] SPIP: Spark Connect - A client and server interface for Apache Spark.

2022-06-05 Thread Mich Talebzadeh
e API can be used with (almost) any
> programming language.
>
> We would like to start a discussion on the document and any feedback is
> welcome!
>
> Thanks a lot in advance,
> Martin
> --

Re: Deluge of GitBox emails

2022-04-04 Thread Mich Talebzadeh
+1 as well receiving :)

Re: Tools for regression testing

2022-03-24 Thread Mich Talebzadeh
likely impact. Integration testing can be achieved through CI/CD, for which I believe Spark relied on Jenkins until recently. HTH

Re: Tools for regression testing

2022-03-24 Thread Mich Talebzadeh
cheers, Mich

Re: [DISCUSS] Migration guide on upgrading Kafka to 3.1 in Spark 3.3

2022-03-23 Thread Mich Talebzadeh
, this should work. In reality Kafka will be running in its own container(s), plus the ZooKeeper containers if any. So as far as I can ascertain, the discussion is about deploying the compatible APIs. HTH

Tools for regression testing

2022-03-21 Thread Mich Talebzadeh
Hi, As a matter of interest do Spark releases deploy a specific regression testing tool? Thanks

Re: Spark Streaming | Dynamic Action Support

2022-03-03 Thread Mich Talebzadeh
e the thread "How to gracefully shutdown Spark Structured Streaming" in https://lists.apache.org/list.html?u...@spark.apache.org HTH

Re: Spark Streaming | Dynamic Action Support

2022-03-03 Thread Mich Talebzadeh
What is the definition of action here?

Re: Reason/rationale for using a service for the driver when running Spark on Kubernetes?

2022-03-02 Thread Mich Talebzadeh
Hi Simon, These are relatively old documents. Specifically, what issues are you foreseeing because of the driver service? HTH

Re: Recap on current status of "SPIP: Support Customized Kubernetes Schedulers"

2022-02-25 Thread Mich Talebzadeh
ks \
  --addons HorizontalPodAutoscaling,HttpLoadBalancing,GcePersistentDiskCsiDriver \
  --enable-autoupgrade \
  --enable-autorepair \
  --max-surge-upgrade 1 \
  --max-unavailable-upgrade 0 \
  --enable-shielded-nodes \
  --node-locations "europe-west2-c"

Re: Recap on current status of "SPIP: Support Customized Kubernetes Schedulers"

2022-02-25 Thread Mich Talebzadeh
or FEATURES="org.apache.spark.deploy.k8s.features.VolcanoFeatureStep", could that be one reason? Regards, Mich

Re: Recap on current status of "SPIP: Support Customized Kubernetes Schedulers"

2022-02-24 Thread Mich Talebzadeh
ken for the volcano runs. Any comments are welcome. HTH

Re: Recap on current status of "SPIP: Support Customized Kubernetes Schedulers"

2022-02-24 Thread Mich Talebzadeh
Volcano Thanks

Re: [Fork] ]RE: One click to run Spark on Kubernetes

2022-02-23 Thread Mich Talebzadeh
Thanks Janak, the same as GKE conventional or GKE autopilot. <https://cloud.google.com/kubernetes-engine> Putting conventional aside, why do you think customers should choose a fully managed package for Spark? Thanks

Re: [Fork] ]RE: One click to run Spark on Kubernetes

2022-02-23 Thread Mich Talebzadeh
Hi Janak, Are you talking about EKS Fargate? Thanks

Spark 3.1.3 docker pre-built with Python Data science packages

2022-02-23 Thread Mich Talebzadeh
4.2
wrapt          1.13.3
xgboost        1.5.2
zipp           3.7.0
zope.interface 5.4.0

Let me know how it works for you.

Re: One click to run Spark on Kubernetes

2022-02-22 Thread Mich Talebzadeh

Re: [ANNOUNCE] Apache Spark 3.1.3 released + Docker images

2022-02-22 Thread Mich Talebzadeh
ues.apache.org/jira/browse/SPARK-36339>:
>> Memory leak in ExecutorAllocationListener breaks dynamic allocation under
>> high load
>>
>> Links to wrong jira ticket?
>>
>> On Tue, 22 Feb 2022 at 15:49, Mich Talebzadeh wrote:
>>
>>> Well, that is pretty easy to do.

Re: [ANNOUNCE] Apache Spark 3.1.3 released + Docker images

2022-02-21 Thread Mich Talebzadeh
-8-jre-slim-buster HTH

Re: [ANNOUNCE] Apache Spark 3.1.3 released + Docker images

2022-02-21 Thread Mich Talebzadeh
d Java 11 respectively. I believe it is a good thing and we ought to adopt that convention. For example: spark-py-3.2.1-scala_2.12-11-jre-slim-buster HTH
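The tag in the example above appears to follow the pattern image name, Spark version, Scala version, Java version, base image. A minimal sketch of composing such a tag, assuming that layout (the function and parameter names are hypothetical, not part of any Spark tooling):

```python
def image_tag(image: str, spark_version: str, scala_version: str,
              java_version: str, base: str = "jre-slim-buster") -> str:
    """Compose a docker tag using the layout assumed from the example above:
    <image>-<spark>-scala_<scala>-<java>-<base>."""
    return f"{image}-{spark_version}-scala_{scala_version}-{java_version}-{base}"


# Reproduces the example tag quoted in the message
print(image_tag("spark-py", "3.2.1", "2.12", "11"))
```

Keeping every version in the tag makes it possible to tell at a glance which Spark, Scala and Java combination an image was built with.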

Re: Time to start publishing Spark Docker Images?

2022-02-21 Thread Mich Talebzadeh
forwarded

Re: [ANNOUNCE] Apache Spark 3.1.3 released + Docker images

2022-02-21 Thread Mich Talebzadeh
Well, that docker link is not found! Maybe a permission issue.

Re: Docker images for Spark 3.1.1 and Spark 3.1.2 with Java 11 and Java 8 from docker hub

2022-02-20 Thread Mich Talebzadeh
Added docker images for Spark 3.2.1 with default 11-jre-slim-buster for spark and spark-py. HTH

Docker images for Spark 3.1.1 and Spark 3.1.2 with Java 11 and Java 8 from docker hub

2022-02-20 Thread Mich Talebzadeh
8-jre-slim-buster Let me know if any issues HTH

Re: Deploying Spark on Google Kubernetes (GKE) autopilot, preliminary findings

2022-02-14 Thread Mich Talebzadeh
d one just wants to deploy it on GCP, so even if, say, the Beam programming model/Dataflow is superior to Hadoop, someone with a lot of Hadoop code might still choose Dataproc or GDE for the time being, rather than rewriting their code on Beam to run on Dataflow. HTH

Deploying Spark on Google Kubernetes (GKE) autopilot, preliminary findings

2022-02-11 Thread Mich Talebzadeh
and I appreciate feedback from other members that have experimented with GKE autopilot or AWS Fargate or are familiar with k8s. Thanks

Re: ASF board report draft for February 2022

2022-02-08 Thread Mich Talebzadeh
QX6-QH_J9YV2C2Dh6RpXefUpLM7KGkzL6Fg> and 2) Storage Partitioned Join for Data Source V2 <https://docs.google.com/document/d/1foTkDSM91VxKgkEcBMsuAvEjNybjja-uHk-r3vtXWFE/edit#heading=h.82w8qxfl2uwl> HTH

Re: spark, autoscaling and handling node loss with autoscaling

2022-02-05 Thread Mich Talebzadeh
that autoscaling is only applied to workload at a clean state. HTH

spark, autoscaling and handling node loss with autoscaling

2022-02-05 Thread Mich Talebzadeh
es?

Re: Spark on K8s : property simillar to yarn.max.application.attempt

2022-02-04 Thread Mich Talebzadeh
Not as far as I know. If your driver pod fails, then you need to resubmit the job. I cannot see what else can be done. HTH

Re: Custom structured streaming source for a system with non-repeatable reads?

2022-02-01 Thread Mich Talebzadeh
e there is a business value proposition to use Pub/Sub in the whole chain. HTH

Re: Custom structured streaming source for a system with non-repeatable reads?

2022-02-01 Thread Mich Talebzadeh
it you are trying to make Pub/Sub behave like SSS. If that is the case, then will Pub/Sub still be required to pass the topics? HTH

Re: Spark on Oracle available as an Apache licensed open source repo

2022-01-14 Thread Mich Talebzadeh
e. Cheers

Re: [DISCUSSION] SPIP: Support Volcano/Alternative Schedulers Proposal

2022-01-06 Thread Mich Talebzadeh
Performance optimization for high-performance workloads" If we agree on the above points mentioned by @William Wang, I propose that they should be incorporated into the doc. HTH

Re: [VOTE][SPIP] Support Customized Kubernetes Schedulers Proposal

2022-01-06 Thread Mich Talebzadeh
+1 (non-binding)

Re: [DISCUSSION] SPIP: Support Volcano/Alternative Schedulers Proposal

2022-01-05 Thread Mich Talebzadeh
+1 non-binding

Re: [DISCUSSION] SPIP: Support Volcano/Alternative Schedulers Proposal

2022-01-05 Thread Mich Talebzadeh
u want to fit exactly one Spark executor pod per Kubernetes node with the current model. HTH

Re: [DISCUSSION] SPIP: Support Volcano/Alternative Schedulers Proposal

2022-01-05 Thread Mich Talebzadeh
abilities that users care about. > - Full lifecycle management for jobs > - Scheduling policies for high-performance workloads(fair-share, topology, > sla, reservation, preemption, backfill etc) > - Support for heterogeneous hardware > - Performance optimization for high-performance work

Re: Time for Spark 3.2.1?

2022-01-04 Thread Mich Talebzadeh
+1 non-binding

Re: [DISCUSSION] SPIP: Support Volcano/Alternative Schedulers Proposal

2022-01-04 Thread Mich Talebzadeh
Interesting, thanks. Do you have a ballpark figure (a rough numerical estimate) of how much adding Volcano as an alternative scheduler is going to improve Spark on k8s performance? Thanks

Re: Sizing the driver & executor cores and memory in Kubernetes cluster

2021-12-16 Thread Mich Talebzadeh
ning FailedScheduling 17m default-scheduler 0/3 nodes are available: 3 Insufficient memory. Normal NotTriggerScaleUp 2m28s (x92 over 17m) cluster-autoscaler pod didn't trigger scale-up: HTH

Sizing the driver & executor cores and memory in Kubernetes cluster

2021-12-14 Thread Mich Talebzadeh
time. It would be interesting to hear whether others have done a similar configuration and what their experience was.

Re: Hi Team, I put a UDF-Utils jar on Google Cloud Storage, but I can't run it

2021-12-13 Thread Mich Talebzadeh

Re: In Kubernetes Must specify the driver container image

2021-12-10 Thread Mich Talebzadeh
Thanks Rob for spotting the error!

In Kubernetes Must specify the driver container image

2021-12-10 Thread Mich Talebzadeh
le *Exception in thread "main" org.apache.spark.SparkException: Must specify the driver container image* It sounds like you still have to specify the container image explicitly regardless. HTH
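One way to catch this before submission is to check the configuration for an image setting. The sketch below assumes the Spark conf is available as a plain dict; `has_driver_image` is a hypothetical helper, while the two property names are the standard Spark on Kubernetes image settings.

```python
# Properties that can supply the driver image: the driver-specific one,
# or the generic one that applies to both driver and executors.
DRIVER_IMAGE_KEYS = (
    "spark.kubernetes.driver.container.image",
    "spark.kubernetes.container.image",
)


def has_driver_image(conf: dict) -> bool:
    """Return True if conf names a non-empty image usable by the driver."""
    return any(conf.get(key, "").strip() for key in DRIVER_IMAGE_KEYS)
```

A submit wrapper could call this first and fail with a clearer message than the exception above. It is only a pre-flight check on the dict it is given and does not see settings that come from other sources such as a defaults file.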

Re: docker image distribution in Kubernetes cluster

2021-12-09 Thread Mich Talebzadeh
3.1.1-scala_2.12-8-jre-slim-buster-java8WithPyyaml and the executors will have 3.1.1-scala_2.12-8-jre-slim-buster-addedpackages with the additional packages. Cheers
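Giving the driver and the executors different images, as described above, can be expressed with the per-role image properties. This is a minimal sketch only: `image_conf_args` is a hypothetical helper, and the repository name in the usage note is a placeholder because the snippet shows only the tags.

```python
from typing import List


def image_conf_args(repo: str, driver_tag: str, executor_tag: str) -> List[str]:
    """Build spark-submit --conf arguments that set separate images
    for the driver and for the executors."""
    return [
        "--conf",
        f"spark.kubernetes.driver.container.image={repo}:{driver_tag}",
        "--conf",
        f"spark.kubernetes.executor.container.image={repo}:{executor_tag}",
    ]
```

For instance, `image_conf_args("example-repo/spark-py", "3.1.1-scala_2.12-8-jre-slim-buster-java8WithPyyaml", "3.1.1-scala_2.12-8-jre-slim-buster-addedpackages")` yields the four arguments to append to an existing spark-submit command line.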
