Re: Spark decommission

2024-07-04 Thread Arun Ravi
Hi Rajesh, We use it in production at scale. We run Spark on Kubernetes on AWS, and here are the key things that we do: 1) we run the driver on an on-demand node; 2) we have configured decommissioning along with the fallback option to S3 (try the latest single-zone S3 for this); 3) we use PVC-aware scheduling
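
The three points above could be expressed as Spark configuration roughly as follows. This is a minimal sketch, not the poster's actual setup: the node label, bucket path, volume name, storage class, size, and mount path are all hypothetical placeholders, and reading "PVC-aware scheduling" as driver-owned, reusable PVCs is an assumption. The helper names `decommission_conf` and `to_submit_args` are made up for illustration; only the `spark.*` keys are real Spark settings.

```python
def decommission_conf(fallback_path, node_label_key, node_label_value,
                      volume_name="spark-local-dir-1"):
    """Build Spark conf entries for the three practices listed above.

    All argument values are supplied by the caller; the defaults and the
    storage class / size below are illustrative placeholders only.
    """
    pvc = f"spark.kubernetes.executor.volumes.persistentVolumeClaim.{volume_name}"
    return {
        # 1) Pin the driver to on-demand capacity with a node selector.
        f"spark.kubernetes.driver.node.selector.{node_label_key}": node_label_value,
        # 2) Graceful decommissioning with block migration and a fallback store.
        "spark.decommission.enabled": "true",
        "spark.storage.decommission.enabled": "true",
        "spark.storage.decommission.shuffleBlocks.enabled": "true",
        "spark.storage.decommission.rddBlocks.enabled": "true",
        "spark.storage.decommission.fallbackStorage.path": fallback_path,
        # 3) Assumed meaning of "PVC-aware": driver-owned, reusable on-demand PVCs.
        "spark.kubernetes.driver.ownPersistentVolumeClaim": "true",
        "spark.kubernetes.driver.reusePersistentVolumeClaim": "true",
        f"{pvc}.options.claimName": "OnDemand",
        f"{pvc}.options.storageClass": "gp3",      # hypothetical storage class
        f"{pvc}.options.sizeLimit": "100Gi",       # hypothetical size
        f"{pvc}.mount.path": "/data/spark-local",  # hypothetical mount path
        f"{pvc}.mount.readOnly": "false",
    }


def to_submit_args(conf):
    """Render a conf dict as a flat list of spark-submit `--conf k=v` arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    conf = decommission_conf(
        fallback_path="s3a://example-bucket/spark-fallback/",  # hypothetical bucket
        node_label_key="karpenter.sh/capacity-type",           # hypothetical label
        node_label_value="on-demand",
    )
    print(" ".join(to_submit_args(conf)))
```

Spark does not clean up the fallback location itself, so the bucket would normally need a lifecycle (TTL) rule.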

Re: External Spark shuffle service for k8s

2024-04-10 Thread Arun Ravi
Hi Everyone, I had explored IBM's and AWS's S3 shuffle plugins (some time back). I had also explored AWS FSx for Lustre in a few of my production jobs, which have ~20TB of shuffle operations with 200-300 executors. What I have observed is that S3 and FSx behaviour was fine during the write phase, however I

Re: Clarification on ExecutorRoll Plugin & Ignore Decommission Fetch Failure

2023-08-25 Thread Arun Ravi
Thank you once again. Arun Ravi M V B.Tech (Batch: 2010-2014) Computer Science and Engineering Govt. Model Engineering College Cochin University Of Science And Technology Kochi On Sat, 26 Aug 2023 at 05:49, Dongjoon Hyun wrote: > Hi, Arun. > > Here are some answers to your questi

Clarification on ExecutorRoll Plugin & Ignore Decommission Fetch Failure

2023-08-25 Thread Arun Ravi
know how I should be using Executor Rolling without triggering stage failures? I am using executor rolling to avoid executors being removed by K8s due to memory pressure or OOM issues, as my Spark job is heavy on shuffling and has a lot of window functions. Any help will be super useful. Arun Ravi M V
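
For readers unfamiliar with the settings this thread's title refers to, the sketch below assembles an executor-rolling configuration together with the flag for ignoring fetch failures caused by decommissioned executors. It is an illustration under assumptions, not the answer given in the thread: the interval, policy, and task threshold are hypothetical values, `executor_roll_conf` is a made-up helper name, and whether this combination avoids stage failures for a given job is not something the snippet above establishes.

```python
def executor_roll_conf(interval="3600s", policy="TOTAL_GC_TIME", min_tasks=100):
    """Build Spark conf entries for periodic executor rolling on Kubernetes.

    The default interval, policy, and task threshold are illustrative only.
    """
    return {
        # Enable the built-in Kubernetes executor rolling plugin.
        "spark.plugins": "org.apache.spark.scheduler.cluster.k8s.ExecutorRollPlugin",
        # How often one executor is picked and decommissioned.
        "spark.kubernetes.executor.rollInterval": interval,
        # Which executor is picked at each interval.
        "spark.kubernetes.executor.rollPolicy": policy,
        # Skip executors that have run fewer tasks than this.
        "spark.kubernetes.executor.minTasksPerExecutorBeforeRolling": str(min_tasks),
        # Rolling relies on decommissioning, so turn it on with shuffle migration.
        "spark.decommission.enabled": "true",
        "spark.storage.decommission.enabled": "true",
        "spark.storage.decommission.shuffleBlocks.enabled": "true",
        # Do not count fetch failures from decommissioned executors
        # toward stage failure (available from Spark 3.4).
        "spark.stage.ignoreDecommissionFetchFailure": "true",
    }


if __name__ == "__main__":
    for key, value in sorted(executor_roll_conf().items()):
        print(f"--conf {key}={value}")
```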

Re: KubernetesLocalDiskShuffleDataIO mount path dependency doubt.

2023-08-11 Thread Arun Ravi
Hi Dongjoon, Thank you for sharing about Old Protocol and clearing my doubt. I was able to understand the difference between Spark 2 & 3. For now `KubernetesLocalDiskShuffleDataIO` works fine for me. Thanks, Arun Ravi M V B.Tech (Batch: 2010-2014) Computer Science and Engineering Govt. M

KubernetesLocalDiskShuffleDataIO mount path dependency doubt.

2023-08-11 Thread Arun Ravi
ull/42417>. Is it to avoid other folders in the volume? Also, does this mean the path should use the executor ID and Spark app ID, or just a hardcoded spark-x/executor-x/? Sorry, I couldn't fully understand the reasoning for this. Any help will be super useful. Arun Ravi M V B.Tech (Batch

Discussing the idea of Shared Volume block store client

2023-04-23 Thread Arun Ravi
s external shuffle servers Thanks in advance for all the feedback and suggestions. Arun Ravi M V