Re: Spark on Kubernetes

2024-04-30 Thread Mich Talebzadeh
On Tue, 30 Apr 2024 at 04:29, Tarun raghav wrote: > Respected Sir/Madam, > I am Tarunraghav. I have a query regarding spark on kubernetes. > > We have an eks cluster, within which we have spark installed in the pods. > We set the executor memory

Spark on Kubernetes

2024-04-29 Thread Tarun raghav
Respected Sir/Madam, I am Tarunraghav. I have a query regarding Spark on Kubernetes. We have an EKS cluster, within which we have Spark installed in the pods. We set the executor memory to 1 GB and the executor instances to 2, and I have also set dynamic allocation to true. So when I try to read a

Re: [Spark on Kubernetes]: Seeking Guidance on Handling Persistent Executor Failures

2024-02-19 Thread Mich Talebzadeh
Thanks for your kind words, Sri. Well, it is true that Spark on Kubernetes is not yet on par with Spark on YARN in maturity; essentially, Spark on Kubernetes is still a work in progress. So in the first place, IMO, one needs to think about why executors are failing. What causes this behaviour? Is it the

Re: [Spark on Kubernetes]: Seeking Guidance on Handling Persistent Executor Failures

2024-02-19 Thread Cheng Pan
On Feb 19, 2024, at 23:59, Sri Potluri wrote: > > Hello Spark Community, > > I am currently leveraging Spark on Kubernetes, managed by the Spark Operator, > for running various Spark applications. While the system generally works > well, I've encountered a chal

Re: [Spark on Kubernetes]: Seeking Guidance on Handling Persistent Executor Failures

2024-02-19 Thread Sri Potluri
eh-ph-d-5205b2/> >> >> >> https://en.everybodywiki.com/Mich_Talebzadeh >> >> >> >> *Disclaimer:* The information provided is correct to the best of my >> knowledge but of course cannot be guaranteed . It is essential to note >> that, as with a

Re: [Spark on Kubernetes]: Seeking Guidance on Handling Persistent Executor Failures

2024-02-19 Thread Mich Talebzadeh
s worth one-thousand > expert opinions (Werner <https://en.wikipedia.org/wiki/Wernher_von_Braun>Von > Braun <https://en.wikipedia.org/wiki/Wernher_von_Braun>)". > > > On Mon, 19 Feb 2024 at 18:34, Sri Potluri wrote: > >> Hello Spark Community, >>

Re: [Spark on Kubernetes]: Seeking Guidance on Handling Persistent Executor Failures

2024-02-19 Thread Mich Talebzadeh
On Mon, 19 Feb 2024 at 18:34, Sri Potluri wrote: > Hello Spark Community, > > I am currently leveraging Spark on Kubernetes, managed by the Spark > Operator, for running various Spark applications. While

[Spark on Kubernetes]: Seeking Guidance on Handling Persistent Executor Failures

2024-02-19 Thread Sri Potluri
Hello Spark Community, I am currently leveraging Spark on Kubernetes, managed by the Spark Operator, for running various Spark applications. While the system generally works well, I've encountered a challenge related to how Spark applications handle executor failures, specifically in scen

Re: Regarding Spark on Kubernetes(EKS)

2024-02-19 Thread Jagannath Majhi
k at this article of mine > > Spark on Kubernetes, A Practitioner’s Guide > <https://www.linkedin.com/pulse/spark-kubernetes-practitioners-guide-mich-talebzadeh-ph-d-%3FtrackingId=Wsu3lkoPaCWqGemYHe8%252BLQ%253D%253D/?trackingId=Wsu3lkoPaCWqGemYHe8%2BLQ%3D%3D> > > HTH >

Re: Regarding Spark on Kubernetes(EKS)

2024-02-19 Thread Jagannath Majhi
I am not using any private docker image. Only I am running the jar file in EMR using spark-submit command so now I want to run this jar file in eks so can you please tell me how can I set-up for this ?? On Mon, Feb 19, 2024, 8:06 PM Jagannath Majhi < jagannath.ma...@cloud.cbnits.com> wrote: > Can

Re: Regarding Spark on Kubernetes(EKS)

2024-02-19 Thread Mich Talebzadeh
Sure, but first it would be beneficial to understand the way Spark works on Kubernetes and the concepts. Have a look at this article of mine Spark on Kubernetes, A Practitioner’s Guide <https://www.linkedin.com/pulse/spark-kubernetes-practitioners-guide-mich-talebzadeh-ph-d-%3Ftrackin

Re: Regarding Spark on Kubernetes(EKS)

2024-02-19 Thread Mich Talebzadeh
OK you have a jar file that you want to work with when running using Spark on k8s as the execution engine (EKS) as opposed to YARN on EMR as the execution engine? Mich Talebzadeh, Dad | Technologist | Solutions Architect | Engineer London United Kingdom view my Linkedin profile

Re: Regarding Spark on Kubernetes(EKS)

2024-02-19 Thread Mich Talebzadeh
Where is your docker file? In ECR container registry. If you are going to use EKS, then it needs to be accessible to all nodes of the cluster. When you build your Docker image, put your jar under the $SPARK_HOME directory. Then add a line to your Docker build file as below. Here I am accessing Google Big

Re: Regarding Spark on Kubernetes(EKS)

2024-02-19 Thread Richard Smith
I run my Spark jobs in GCP with Google Dataproc using GCS buckets. I've not used AWS, but its EMR product offers similar functionality to Dataproc. The title of your post implies your Spark cluster runs on EKS. You might be better off using EMR, see links below: EMR https://medium.com/big-da

Regarding Spark on Kubernetes(EKS)

2024-02-19 Thread Jagannath Majhi
Dear Spark Community, I hope this email finds you well. I am reaching out to seek assistance and guidance regarding a task I'm currently working on involving Apache Spark. I have developed a JAR file that contains some Spark applications and functionality, and I need to run this JAR file within a

Re: Seeking Guidance on Spark on Kubernetes Secrets Configuration

2023-10-01 Thread Jon Rodríguez Aranguren
>> Dear Spark Community Members, >> >> I trust this message finds you all in good health and spirits. >> >> I'm reaching out to the collective expertise of this esteemed community >> with a query regarding Spark on Kubernetes. As a newcomer, I have always >> admired the dep

Re: Seeking Guidance on Spark on Kubernetes Secrets Configuration

2023-10-01 Thread Jörn Franke
I'm reaching out to the collective expertise of this esteemed community with a query regarding Spark on Kubernetes. As a newcomer, I have always admired the depth and breadth of knowledge shared within this forum, and it is my hope that some of you might have insights on a specific challenge I'm f

Re: Seeking Guidance on Spark on Kubernetes Secrets Configuration

2023-10-01 Thread Jörn Franke
Jon Rodríguez Aranguren <jon.r.arangu...@gmail.com> wrote: Dear Spark Community Members, I trust this message finds you all in good health and spirits. I'm reaching out to the collective expertise of this esteemed community with a query regarding Spark on Kubernetes. As a newcomer, I ha

Re: Seeking Guidance on Spark on Kubernetes Secrets Configuration

2023-10-01 Thread Mich Talebzadeh
<jon.r.arangu...@gmail.com>: >> >> Dear Spark Community Members, >> >> I trust this message finds you all in good health and spirits. >> >> I'm reaching out to the collective expertise of this esteemed community >> with

Re: Seeking Guidance on Spark on Kubernetes Secrets Configuration

2023-09-30 Thread Jayabindu Singh
> Dear Spark Community Members, > > I trust this message finds you all in good health and spirits. > > I'm reaching out to the collective expertise of this esteemed community > with a query regarding Spark on Kubernetes. As a newcomer, I have always > admired the depth and

Re: Seeking Guidance on Spark on Kubernetes Secrets Configuration

2023-09-30 Thread Jörn Franke
Rodríguez Aranguren: > > Dear Spark Community Members, > > I trust this message finds you all in good health and spirits. > > I'm reaching out to the collective expertise of this esteemed community with > a query regarding Spark on Kubernetes. As a newcomer, I h

Seeking Guidance on Spark on Kubernetes Secrets Configuration

2023-09-29 Thread Jon Rodríguez Aranguren
Dear Spark Community Members, I trust this message finds you all in good health and spirits. I'm reaching out to the collective expertise of this esteemed community with a query regarding Spark on Kubernetes. As a newcomer, I have always admired the depth and breadth of knowledge shared w

SPIP: Adding work load identity to Spark on Kubernetes documents (supersedes Secret Management)

2023-02-20 Thread Mich Talebzadeh
Hi, I would like to propose that the current Secret Management section in the Spark on Kubernetes documentation include the more secure credential mechanism (Workload Identity) for Spark to access Kubernetes services. Both Google Clou

Re: Running Spark on Kubernetes (GKE) - failing on spark-submit

2023-02-16 Thread Mich Talebzadeh
>> It may help to check this article of mine >> >> >> Spark on Kubernetes, A Practitioner’s Guide >> <https://www.linkedin.com/pulse/spark-kubernetes-practitioners-guide-mich-talebzadeh-ph-d-/?trackingId=FDQORri0TBeJl02p3D%2B2JA%3D%3D> >> >> >> H

Re: Running Spark on Kubernetes (GKE) - failing on spark-submit

2023-02-15 Thread karan alang
Thanks, Mich. Let me check this. On Wed, Feb 15, 2023 at 1:42 AM Mich Talebzadeh wrote: > > It may help to check this article of mine > > > Spark on Kubernetes, A Practitioner’s Guide > <https://www.linkedin.com/pulse/spark-kubernetes-practitioners-guide-mich-talebza

Re: Running Spark on Kubernetes (GKE) - failing on spark-submit

2023-02-15 Thread Mich Talebzadeh
It may help to check this article of mine Spark on Kubernetes, A Practitioner’s Guide <https://www.linkedin.com/pulse/spark-kubernetes-practitioners-guide-mich-talebzadeh-ph-d-/?trackingId=FDQORri0TBeJl02p3D%2B2JA%3D%3D> HTH view my Linkedin profile <https://www.linkedin.co

Re: Running Spark on Kubernetes (GKE) - failing on spark-submit

2023-02-15 Thread Mich Talebzadeh
Your submit command spark-submit --master k8s://https://34.74.22.140:7077 --deploy-mode cluster --name pyspark-example --conf spark.kubernetes.container.image=pyspark-example:0.1 --conf spark.kubernetes.file.upload.path=/myexample src/StructuredStream-on-gke.py pay attention to what it says --

Re: Running Spark on Kubernetes (GKE) - failing on spark-submit

2023-02-14 Thread karan alang
Hi Ye, This is the error I get when I don't set spark.kubernetes.file.upload.path. Any ideas on how to fix this? ``` Exception in thread "main" org.apache.spark.SparkException: Please specify spark.kubernetes.file.upload.path property. at org.apache.spark.deploy.k8s.KubernetesUtils$.upload

Re: Running Spark on Kubernetes (GKE) - failing on spark-submit

2023-02-14 Thread Ye Xianjin
The configuration of ‘…file.upload.path’ is wrong. It refers to a distributed file system path where your archives, resources, and jars are stored temporarily before Spark distributes them to the driver and executors. In your case, you don’t need to set this configuration. On Feb 14, 2023, at 5:43 AM, karan alang w

Re: Running Spark on Kubernetes (GKE) - failing on spark-submit

2023-02-14 Thread Khalid Mammadov
I am not a k8s expert, but I think you have a permission issue. Try 777 as an example to see if it works. On Mon, 13 Feb 2023, 21:42 karan alang, wrote: > Hello All, > > I'm trying to run a simple application on GKE (Kubernetes), and it is > failing: > Note : I have spark(bitnami spark chart) installe

Running Spark on Kubernetes (GKE) - failing on spark-submit

2023-02-13 Thread karan alang
Hello All, I'm trying to run a simple application on GKE (Kubernetes), and it is failing: Note : I have spark(bitnami spark chart) installed on GKE using helm install Here is what is done : 1. created a docker image using Dockerfile Dockerfile : ``` FROM python:3.7-slim RUN apt-get update && \

Re: spark on kubernetes

2022-10-16 Thread Qian Sun
Glad to hear it! On Sun, Oct 16, 2022 at 2:37 PM Mohammad Abdollahzade Arani < mamadazar...@gmail.com> wrote: > Hi Qian, > Thanks for the reply and I'm so sorry for the late reply. > I found the answer. My mistake was token conversion. I had to > base64-decode the service account's token and cert
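
The fix mentioned in the quoted message, base64-decoding the service account token, can be sketched as below. This is only a minimal illustration and not the poster's actual commands: it assumes the token was read from a Kubernetes Secret, where values in the `data` map are stored base64-encoded, and the sample value and the function name `decode_secret_value` are made up for the example.

```python
import base64


def decode_secret_value(encoded: str) -> str:
    """Decode one base64-encoded value as stored in a Kubernetes Secret's `data` map."""
    return base64.b64decode(encoded).decode("utf-8")


# Hypothetical stand-in for a real token field. A real value would come from
# something like: kubectl get secret <name> -o jsonpath='{.data.token}'
sample_encoded = base64.b64encode(b"example-token").decode("ascii")

decoded = decode_secret_value(sample_encoded)
print(decoded)
```

The decoded string, not the base64 text, is what would then be handed to Spark, for example through a setting such as spark.kubernetes.authenticate.submission.oauthToken; which setting applies depends on how the job is submitted.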

Re: spark on kubernetes

2022-10-15 Thread Qian Sun
Hi Mohammad Did you try this command? ./bin/spark-submit \ --master k8s://https://vm13:6443 \ --class com.example.WordCounter \ --conf spark.kubernetes.authenticate.driver.serviceAccountName=default \ --conf spark.kubernetes.container.image=private-docker-registery/spark/spark:3.2.1-3

spark on kubernetes

2022-10-15 Thread Mohammad Abdollahzade Arani
I have a k8s cluster and a Spark cluster. My question is as below: https://stackoverflow.com/questions/74053948/how-to-resolve-pods-is-forbidden-user-systemanonymous-cannot-watch-resourc I have searched and found lots of other similar questions on Stack Overflow without an answer, like bel

Re: One click to run Spark on Kubernetes

2022-02-23 Thread bo yang
mit. >>>>> However, that depends on setting up access permission, use of service >>>>> accounts, pulling the correct dockerfiles for the driver and the >>>>> executors. >>>>> Those details add to the complexity. >>>>> >

Re: One click to run Spark on Kubernetes

2022-02-23 Thread Bjørn Jørgensen
;>>> >>>> Thanks >>>> >>>> >>>> >>>>view my Linkedin profile >>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/> >>>> >>>> >>>> https://en.everybodywiki.com/M

Re: One click to run Spark on Kubernetes

2022-02-23 Thread bo yang
gt;>> >>> >>> https://en.everybodywiki.com/Mich_Talebzadeh >>> >>> >>> >>> *Disclaimer:* Use it at your own risk. Any and all responsibility for >>> any loss, damage or destruction of data or any other property which may >>>

Re: One click to run Spark on Kubernetes

2022-02-23 Thread Sarath Annareddy
he complexity. >>>> >>>> Thanks >>>> >>>> >>>>view my Linkedin profile >>>> >>>> >>>> >>>> https://en.everybodywiki.com/Mich_Talebzadeh >>>> >>>> >>&g

Re: One click to run Spark on Kubernetes

2022-02-23 Thread bo yang

Re: One click to run Spark on Kubernetes

2022-02-23 Thread Sarath Annareddy

Re: One click to run Spark on Kubernetes

2022-02-23 Thread Bitfox

Re: One click to run Spark on Kubernetes

2022-02-23 Thread bo yang
On Wed, 23 Feb 2022 at 04:06, bo yang wrote: > >> Hi Spark Community, >> >> We built an open source tool to deploy and run Spark on Kubernetes with a >>

Re: One click to run Spark on Kubernetes

2022-02-22 Thread Mich Talebzadeh
On Wed, 23 Feb 2022 at 04:06, bo yang wrote: > Hi Spark Community, > > We built an open source tool to deploy and run Spark on Kube

Re: One click to run Spark on Kubernetes

2022-02-22 Thread bo yang
>>>> The one click tool intends to hide these details, so people could just >>>> submit Spark and do not need to deal with too many deployment details. >>>> >>>> On Tue, Feb 22, 2022 at 8:09 PM Bitfox wrote: >>>> >>>>> Can it be a clu

Re: One click to run Spark on Kubernetes

2022-02-22 Thread amihay gonen
do not need to deal with too many deployment details. >>> >>> On Tue, Feb 22, 2022 at 8:09 PM Bitfox wrote: >>> >>>> Can it be a cluster installation of spark? or just the standalone node? >>>> >>>> Thanks >>>> >>>&g

Re: One click to run Spark on Kubernetes

2022-02-22 Thread bo yang
one node? >>> >>> Thanks >>> >>> On Wed, Feb 23, 2022 at 12:06 PM bo yang wrote: >>> >>>> Hi Spark Community, >>>> >>>> We built an open source tool to deploy and run Spark on Kubernetes with >>>> a one

Re: One click to run Spark on Kubernetes

2022-02-22 Thread Prasad Paravatha
12:06 PM bo yang wrote: >> >>> Hi Spark Community, >>> >>> We built an open source tool to deploy and run Spark on Kubernetes with >>> a one click command. For example, on AWS, it could automatically create an >>> EKS cluster, node group, NGINX i

Re: One click to run Spark on Kubernetes

2022-02-22 Thread Bitfox
ster installation of spark? or just the standalone node? >> >> Thanks >> >> On Wed, Feb 23, 2022 at 12:06 PM bo yang wrote: >> >>> Hi Spark Community, >>> >>> We built an open source tool to deploy and run Spark on Kubernetes with >>> a on

Re: One click to run Spark on Kubernetes

2022-02-22 Thread bo yang
ion of spark? or just the standalone node? > > Thanks > > On Wed, Feb 23, 2022 at 12:06 PM bo yang wrote: > >> Hi Spark Community, >> >> We built an open source tool to deploy and run Spark on Kubernetes with a >> one click command. For example, on AWS, it co

Re: One click to run Spark on Kubernetes

2022-02-22 Thread Bitfox
Can it be a cluster installation of spark? or just the standalone node? Thanks On Wed, Feb 23, 2022 at 12:06 PM bo yang wrote: > Hi Spark Community, > > We built an open source tool to deploy and run Spark on Kubernetes with a > one click command. For example, on AWS, it could a

One click to run Spark on Kubernetes

2022-02-22 Thread bo yang
Hi Spark Community, We built an open source tool to deploy and run Spark on Kubernetes with a one click command. For example, on AWS, it could automatically create an EKS cluster, node group, NGINX ingress, and Spark Operator. Then you will be able to use curl or a CLI tool to submit Spark

Re: Spark on Kubernetes scheduler variety

2021-07-08 Thread Mich Talebzadeh
I >>>> may say I doubt whether such an approach and the so-called democratization >>>> of Spark on whatever platform is really should be of great focus. >>>> >>>> Having worked on Google Dataproc <https://cloud.google.com/dataproc> (A >>

Re: Spark on Kubernetes scheduler variety

2021-07-08 Thread Holden Karau
ort plus >>> talent on batch scheduling on Kubernetes could be rewarding. However, if I >>> may say I doubt whether such an approach and the so-called democratization >>> of Spark on whatever platform is really should be of great focus. >>> >>> Having wor

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread Mich Talebzadeh
Hi Holden, Thank you for your points. I guess coming from a corporate world I had an oversight on how an open source project like Spark does leverage resources and interest :). As @KlausMa kindly volunteered it would be good to hear scheduling ideas on Spark on Kubernetes and of course as I am

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread Holden Karau
so-called democratization >> of Spark on whatever platform is really should be of great focus. >> >> Having worked on Google Dataproc <https://cloud.google.com/dataproc> (A fully >> managed and highly scalable service for running Apache Spark, Hadoop and >> more rec

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread Lalwani, Jayesh
You can always chain aggregations by chaining multiple Structured Streaming jobs. It’s not a showstopper. Getting Spark on Kubernetes is important for organizations that want to pursue a multi-cloud strategy From: Mich Talebzadeh Date: Wednesday, June 23, 2021 at 11:27 AM To: "user @

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread John Zhuge
gle Dataproc <https://cloud.google.com/dataproc> (A fully >> managed and highly scalable service for running Apache Spark, Hadoop and >> more recently other artefacts) for that past two years, and Spark on >> Kubernetes on-premise, I have come to the conclusion that Spark is not

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread Mich Talebzadeh
om/dataproc> (A fully >> managed and highly scalable service for running Apache Spark, Hadoop and >> more recently other artefacts) for that past two years, and Spark on >> Kubernetes on-premise, I have come to the conclusion that Spark is not a >> beast that that one

Re: Spark on Kubernetes scheduler variety

2021-06-23 Thread Klaus Ma
t focus. > > Having worked on Google Dataproc <https://cloud.google.com/dataproc> (A fully > managed and highly scalable service for running Apache Spark, Hadoop and > more recently other artefacts) for that past two years, and Spark on > Kubernetes on-premise, I have come to

Re: Spark on Kubernetes scheduler variety

2021-06-23 Thread Mich Talebzadeh
democratization of Spark on whatever platform really should be of great focus. Having worked on Google Dataproc <https://cloud.google.com/dataproc> (A fully managed and highly scalable service for running Apache Spark, Hadoop and more recently other artefacts) for the past two years, and Sp

Re: Question on spark on Kubernetes

2021-05-20 Thread Gourav Sengupta
Hi Mithalee, let's start with why. Why are you using Kubernetes and not just EMR on EC2? Do you have extremely bespoke library dependencies and requirements? Or do your workloads fail if the clusters do not scale up or down within a few minutes? Regards, Gourav Sengupta On Thu, May 20, 2021 at

Question on spark on Kubernetes

2021-05-20 Thread Mithalee Mohapatra
Hi, I am currently trying to run spark submit in Kubernetes. I have set up the IAM roles for serviceaccount and generated the ARN. I am trying to use the "spark.hadoop.fs.s3a.fast.upload=true --conf spark.hadoop.fs.s3a.aws.credentials.provider=com.amazonaws.auth.WebIdentityTokenCredentialsProvider"

Re: Dynamic Allocation Backlog Property in Spark on Kubernetes

2021-04-10 Thread ranju goel
ing schedulerbacklogtimeout (say 15 mins) and speeds up the >> job. >> >> >> [image: image.png] >> >> Best Regards >> >> >> >> >> >> *From:* Attila Zsolt Piros >> *Sent:* Friday, April 9, 2021 11:11 AM >> *To:* R

Re: Dynamic Allocation Backlog Property in Spark on Kubernetes

2021-04-10 Thread Attila Zsolt Piros
rds > > > > > > *From:* Attila Zsolt Piros > *Sent:* Friday, April 9, 2021 11:11 AM > *To:* Ranju Jain > *Cc:* user@spark.apache.org > *Subject:* Re: Dynamic Allocation Backlog Property in Spark on Kubernetes > > > > You should not set "spark.dynamicAllocatio

Re: Dynamic Allocation Backlog Property in Spark on Kubernetes

2021-04-10 Thread ranju goel
Piros *Sent:* Friday, April 9, 2021 11:11 AM *To:* Ranju Jain *Cc:* user@spark.apache.org *Subject:* Re: Dynamic Allocation Backlog Property in Spark on Kubernetes You should not set "spark.dynamicAllocation.schedulerBacklogTimeout" so high and the purpose of this config is very differen

Re: Dynamic Allocation Backlog Property in Spark on Kubernetes

2021-04-08 Thread Attila Zsolt Piros
> > > > Regards > > Ranju > > > > > > *From:* Attila Zsolt Piros > *Sent:* Friday, April 9, 2021 12:13 AM > *To:* Ranju Jain > *Cc:* user@spark.apache.org > *Subject:* Re: Dynamic Allocation Backlog Property in Spark on Kubernetes > > > >

RE: Dynamic Allocation Backlog Property in Spark on Kubernetes

2021-04-08 Thread Ranju Jain
Hi! For dynamic allocation you do not need to run the Spark jobs in parallel. Dynamic allocation simply means Spark scales up by requesting more executors when there are pending tasks
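
The behaviour described in this reply, where Spark requests more executors once tasks have been pending for a while, is controlled by a handful of configuration properties. The sketch below only assembles `--conf` arguments for spark-submit; the helper name `dynamic_allocation_args` and the chosen values are illustrative assumptions, and enabling `spark.dynamicAllocation.shuffleTracking.enabled` assumes Spark 3.0 or later running without an external shuffle service.

```python
from typing import List


def dynamic_allocation_args(min_executors: int, max_executors: int,
                            backlog_timeout: str = "1s") -> List[str]:
    """Build spark-submit --conf arguments that enable dynamic allocation.

    backlog_timeout maps to spark.dynamicAllocation.schedulerBacklogTimeout:
    how long tasks must stay pending before more executors are requested.
    """
    if min_executors > max_executors:
        raise ValueError("min_executors must not exceed max_executors")
    conf = {
        "spark.dynamicAllocation.enabled": "true",
        # Assumption: no external shuffle service is available on Kubernetes.
        "spark.dynamicAllocation.shuffleTracking.enabled": "true",
        "spark.dynamicAllocation.minExecutors": str(min_executors),
        "spark.dynamicAllocation.maxExecutors": str(max_executors),
        "spark.dynamicAllocation.schedulerBacklogTimeout": backlog_timeout,
    }
    args: List[str] = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


print(" ".join(dynamic_allocation_args(1, 4)))
```

Keeping the backlog timeout short lets Spark react quickly to pending tasks; a very large value mostly delays scale-up rather than coordinating separate jobs.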

Re: Dynamic Allocation Backlog Property in Spark on Kubernetes

2021-04-08 Thread Attila Zsolt Piros
ynchronize different Spark jobs but it is about tasks. Best regards, Attila On Tue, Apr 6, 2021 at 1:59 PM Ranju Jain wrote: > Hi All, > > > > I have set dynamic allocation enabled while running spark on Kubernetes . > But new executors are requested if pending tasks are backlogge

Dynamic Allocation Backlog Property in Spark on Kubernetes

2021-04-06 Thread Ranju Jain
Hi All, I have set dynamic allocation enabled while running spark on Kubernetes . But new executors are requested if pending tasks are backlogged for more than configured duration in property "spark.dynamicAllocation.schedulerBacklogTimeout". My Use Case is: There are number of par

RE: Spark on Kubernetes | 3.0.1 | Shared Volume or NFS

2021-03-11 Thread Ranju Jain
Ok! Thanks for all guidance :-) Regards Ranju From: Mich Talebzadeh Sent: Thursday, March 11, 2021 11:07 PM To: Ranju Jain Cc: user@spark.apache.org Subject: Re: Spark on Kubernetes | 3.0.1 | Shared Volume or NFS I don't have any specific reference. However, you can do a Google search.

Re: Spark on Kubernetes | 3.0.1 | Shared Volume or NFS

2021-03-11 Thread Mich Talebzadeh
> > > > Do you have any reference or links where I can check out the Shared > Volumes ? > > > > Regards > > Ranju > > > > *From:* Mich Talebzadeh > *Sent:* Thursday, March 11, 2021 5:38 PM > *Cc:* user@spark.apache.org > *Subject:* Re: Spar

RE: Spark on Kubernetes | 3.0.1 | Shared Volume or NFS

2021-03-11 Thread Ranju Jain
: user@spark.apache.org Subject: Re: Spark on Kubernetes | 3.0.1 | Shared Volume or NFS Well your mileage varies so to speak. The only way to find out is setting an NFS mount and testing it. The performance will depend on the mounted file system and the amount of cache it has. File cache is

Re: Spark on Kubernetes | 3.0.1 | Shared Volume or NFS

2021-03-11 Thread Mich Talebzadeh
, March 11, 2021 5:22 PM > *To:* Ranju Jain > *Cc:* user@spark.apache.org > *Subject:* Re: Spark on Kubernetes | 3.0.1 | Shared Volume or NFS > > > > Ok this is on Google Cloud correct? > > > > > > > > > LinkedIn >

RE: Spark on Kubernetes | 3.0.1 | Shared Volume or NFS

2021-03-11 Thread Ranju Jain
the other sides [drawback]. Regards Ranju From: Mich Talebzadeh Sent: Thursday, March 11, 2021 5:22 PM To: Ranju Jain Cc: user@spark.apache.org Subject: Re: Spark on Kubernetes | 3.0.1 | Shared Volume or NFS Ok this is on Google Cloud correct? LinkedIn https://www.linkedin.com/profile

Re: Spark on Kubernetes | 3.0.1 | Shared Volume or NFS

2021-03-11 Thread Mich Talebzadeh
Ok, this is on Google Cloud, correct?

Spark on Kubernetes | 3.0.1 | Shared Volume or NFS

2021-03-11 Thread Ranju Jain
Hi, I need to write the data from all executor pods to a common location that the driver pod can access and retrieve. I was first planning to go with NFS, but I think Shared Volume is equally good. Please suggest: is there any major drawback in using Shared Volume instead of NFS when many pods

Re: vm.swappiness value for Spark on Kubernetes

2021-02-16 Thread Sean Owen
You probably don't want swapping in any environment. Some tasks will grind to a halt under mem pressure rather than just fail quickly. You would want to simply provision more memory. On Tue, Feb 16, 2021, 7:57 AM Jahar Tyagi wrote: > Hi, > > We have recently migrated from Spark 2.4.4 to Spark 3.

vm.swappiness value for Spark on Kubernetes

2021-02-16 Thread Jahar Tyagi
Hi, We have recently migrated from Spark 2.4.4 to Spark 3.0.1 and using Spark in virtual machine/bare metal as standalone deployment and as kubernetes deployment as well. There is a kernel parameter named as 'vm.swappiness' and we keep its value as '1' in standard deployment. Now since we are mov

[Spark on Kubernetes] Spark Application dependency management Question.

2021-02-03 Thread xgong
Hey Team: Currently, we were upgrading the spark version from 2.4 to 3.0. But we found that the applications, which work in spark 2.4, keep failing with Spark 3.0. We are running Spark on Kubernetes with cluster mode. In spark-submit, we have "--jars local:///apps-dep/spark-extra-jars/*

RE: Spark on Kubernetes : unable to write files to HDFS

2020-12-16 Thread Loic DESCOTTE
Everything is working fine now 🙂 Thanks again Loïc From: German Schiavon Sent: Wednesday, 16 December 2020 19:23 To: Loic DESCOTTE Cc: user@spark.apache.org Subject: Re: Spark on Kubernetes : unable to write files to HDFS We've all been there! No reason to be

Re: Spark on Kubernetes : unable to write files to HDFS

2020-12-16 Thread German Schiavon
16 December 2020 18:01 > *To:* Loic DESCOTTE > *Cc:* user@spark.apache.org > *Subject:* Re: Spark on Kubernetes : unable to write files to HDFS > > Hi, > > It seems that you have a typo, no? > > Exception in thread "main" java.io.IOException: No FileSystem for scheme

RE: Spark on Kubernetes : unable to write files to HDFS

2020-12-16 Thread Loic DESCOTTE
Oh, thank you, you're right!! I feel ashamed. From: German Schiavon Sent: Wednesday, 16 December 2020 18:01 To: Loic DESCOTTE Cc: user@spark.apache.org Subject: Re: Spark on Kubernetes : unable to write files to HDFS Hi, seems that you have a ty

Re: Spark on Kubernetes : unable to write files to HDFS

2020-12-16 Thread German Schiavon
> .appName("Hello Spark 7") > .config("fs.hdfs.impl", classOf[org.apache.hadoop.hdfs. > DistributedFileSystem].getName) > .getOrCreate() > > > But still the same error... > > ---------- > *From:* Sean Ow

RE: Spark on Kubernetes : unable to write files to HDFS

2020-12-16 Thread Loic DESCOTTE
hdfs.impl", classOf[org.apache.hadoop.hdfs.DistributedFileSystem].getName) .getOrCreate() But still the same error... From: Sean Owen Sent: Wednesday, 16 December 2020 14:27 To: Loic DESCOTTE Subject: Re: Spark on Kubernetes : unable to write files to HDFS

Spark on Kubernetes : unable to write files to HDFS

2020-12-16 Thread Loic DESCOTTE
Hello, I am using Spark On Kubernetes and I have the following error when I try to write data on HDFS : "no filesystem for scheme hdfs" More details : I am submitting my application with Spark submit like this : spark-submit --master k8s://https://myK8SMaster:6443 \ --deploy-mo

Spark on Kubernetes

2020-11-13 Thread Arti Pande
Hi, Is it recommended to use Spark on K8S in production? Spark operator for Kubernetes seems to be in beta state. https://github.com/GoogleCloudPlatform/spark-on-k8s-operator#:~:text=The%20Kubernetes%20Operator%20for%20Apache%20Spark%20aims%20to%20make%20specifying,surfacing%20status%20of%20Spark

spark on kubernetes client mode

2020-06-30 Thread Pradeepta Choudhury
Hi team, I am working with Spark on Kubernetes on a scenario where I need to use it in client mode from a Jupyter notebook across two different Kubernetes clusters. Is it possible in client mode to spin up the driver in one k8s cluster and the executors in another k8s cluster

Spark on kubernetes memory spike and spark.kubernetes.memoryOverheadFactor not working

2020-05-27 Thread Maiti, Mousam
Hi Team, We are using spark on Kubernetes, through spark-on-k8s-operator. Our application deals with multiple updateStateByKey operations. Upon investigation, we found that the spark application consumes a higher volume of memory. As spark-on-k8s-operator doesn't give the option to segr

Re: dynamic executor scalling spark on kubernetes client mode

2020-05-12 Thread Steven Stetzler
tand the barriers to getting dynamic executor >> scaling to work in client mode on Kubernetes? >> >> Thanks, >> Steven >> >> On Sat, May 9, 2020 at 9:48 AM Pradeepta Choudhury < >> pradeeptachoudhu...@gmail.com> wrote: >> >>> Hiii , >>&

Dependency management using https in spark on kubernetes

2020-05-12 Thread Pradeepta Choudhury
Hey guys, I have an external API from which I can download the main jar. When I do a spark-submit ...all confs...https:api.call.com/somefile.jar, it gives an error that the file already exists in the tmp directory and that the file content doesn't match. How can I fix this? Do I need to use an kubernet

Re: dynamic executor scalling spark on kubernetes client mode

2020-05-12 Thread Pradeepta Choudhury
y 9, 2020 at 9:48 AM Pradeepta Choudhury < > pradeeptachoudhu...@gmail.com> wrote: > >> Hiii , >> >> The dynamic executor scalling is working fine for spark on kubernetes >> (latest from spark master repository ) in cluster mode . is the dynamic >> executor scal

Re: dynamic executor scalling spark on kubernetes client mode

2020-05-12 Thread Roland Johann
k in client mode on Kubernetes? > > Thanks, > Steven > > On Sat, May 9, 2020 at 9:48 AM Pradeepta Choudhury > mailto:pradeeptachoudhu...@gmail.com>> wrote: > Hiii , > > The dynamic executor scalling is working fine for spark on kubernetes (latest > from spark master

Re: dynamic executor scalling spark on kubernetes client mode

2020-05-12 Thread Steven Stetzler
? Thanks, Steven On Sat, May 9, 2020 at 9:48 AM Pradeepta Choudhury < pradeeptachoudhu...@gmail.com> wrote: > Hiii , > > The dynamic executor scalling is working fine for spark on kubernetes > (latest from spark master repository ) in cluster mode . is the dynamic > executor s

dynamic executor scalling spark on kubernetes client mode

2020-05-09 Thread Pradeepta Choudhury
Hi, dynamic executor scaling is working fine for Spark on Kubernetes (latest from the Spark master repository) in cluster mode. Is dynamic executor scaling available for client mode? If yes, where can I find the usage doc for it? If not, is there any PR open for this? Thanks

Re: Spark on kubernetes : missing spark.kubernetes.driver.request.cores parameter ?

2019-10-04 Thread jcdauchy
I am actually answering myself, as I have checked on the master 3.x branch, and this feature is there! https://issues.apache.org/jira/browse/SPARK-27754 So my understanding was correct. -- Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/ -

Spark on kubernetes : missing spark.kubernetes.driver.request.cores parameter ?

2019-10-04 Thread jcdauchy
Hello all, I am surprised that it is not possible to define "spark.kubernetes.driver.request.cores" when submitting a spark job on kubernetes. My understanding is that it would limit the CPU requests for the driver on the k8s cluster and we could still define how many cores (threads) we use in th

RE: Spark on Kubernetes - log4j.properties not read

2019-06-11 Thread Dave Jaffe
That did the trick, Abhishek! Thanks for the explanation, that answered a lot of questions I had. Dave

RE: Spark on Kubernetes - log4j.properties not read

2019-06-10 Thread Rao, Abhishek (Nokia - IN/Bangalore)
case. You could try to build the container by placing the log4j.properties at some other location and set the same in spark.driver.extraJavaOptions Thanks and Regards, Abhishek From: Dave Jaffe Sent: Tuesday, June 11, 2019 6:45 AM To: user@spark.apache.org Subject: Spark on Kubernetes

Spark on Kubernetes - log4j.properties not read

2019-06-10 Thread Dave Jaffe
I am using Spark on Kubernetes from Spark 2.4.3. I have created a log4j.properties file in my local spark/conf directory and modified it so that the console (or, in the case of Kubernetes, the log) only shows warnings and higher (log4j.rootCategory=WARN, console). I then added the command COPY

Spark on Kubernetes Authentication error

2019-06-06 Thread Nick Dawes
Hi there, I'm trying to run Spark on EKS. Created an EKS cluster, added nodes and then trying to submit a Spark job from an EC2 instance. Ran following commands for access. kubectl create serviceaccount spark kubectl create clusterrolebinding spark-role --clusterrole=admin --serviceaccount=defaul
