Accounting the impact of failures in spark jobs

2024-04-19 Thread Faiz Halde
Hello, In my organization, we have an accounting system for Spark jobs that uses task execution time to determine how long each job occupies the executors, and we use that as a way to apportion cost. We sum all the task times per job and apply proportions. Our clusters follow a 1 task
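
The "sum all the task times per job and apply proportions" step could look roughly like the sketch below. This is only an illustration and not the poster's actual system: the function name `cost_shares`, the `(job_id, task_seconds)` record shape, and the `total_cost` parameter are all assumptions made for the example.

```python
from collections import defaultdict


def cost_shares(task_records, total_cost):
    """Split total_cost across jobs in proportion to summed task time.

    task_records: iterable of (job_id, task_seconds) pairs (hypothetical shape).
    total_cost:   the cluster cost to apportion (hypothetical parameter).
    """
    # Sum task execution time per job.
    per_job = defaultdict(float)
    for job_id, seconds in task_records:
        per_job[job_id] += seconds

    total = sum(per_job.values())
    if total == 0:
        # No task time recorded: nothing to apportion.
        return {job: 0.0 for job in per_job}

    # Each job's share is its fraction of the total task time.
    return {job: total_cost * secs / total for job, secs in per_job.items()}


shares = cost_shares([("a", 30), ("b", 10), ("a", 30), ("b", 30)], 100.0)
```

Note that a scheme like this counts the time of failed or recomputed tasks the same as useful work, unless those tasks are filtered out before summing.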

Re: Spark on Java 17

2023-12-09 Thread Faiz Halde
n.ch/node/192 > > > > Best, > > Luca > > > > > > *From:* Faiz Halde > *Sent:* Thursday, December 7, 2023 23:25 > *To:* user@spark.apache.org > *Subject:* Spark on Java 17 > > > > Hello, > > > > We are planning to switch to Java 17 for

Spark on Java 17

2023-12-07 Thread Faiz Halde
Hello, We are planning to switch to Java 17 for Spark and were wondering if there are any obvious learnings from anybody related to JVM tuning? We've been running on Java 8 for a while now and used to use the parallel GC, as that used to be a general recommendation for high-throughput systems. How

ML using Spark Connect

2023-12-01 Thread Faiz Halde
Hello, Is it possible to run SparkML using Spark Connect 3.5.0? So far I've had no success setting up a Connect client that uses the ML package. The ML package uses spark core/sql afaik, which seems to be shadowing the Spark Connect client classes. Do I have to exclude any dependencies from the mllib

Re: Classpath isolation per SparkSession without Spark Connect

2023-11-28 Thread Faiz Halde
12:47, Holden Karau wrote: > >> So I don’t think we make any particular guarantees around class path >> isolation there, so even if it does work it’s something you’d need to pay >> attention to on upgrades. Class path isolation is tricky to get right. >> >> On Mon, Nov 27,

Re: Classpath isolation per SparkSession without Spark Connect

2023-11-27 Thread Faiz Halde
Holden Karau wrote: > So I don’t think we make any particular guarantees around class path > isolation there, so even if it does work it’s something you’d need to pay > attention to on upgrades. Class path isolation is tricky to get right. > > On Mon, Nov 27, 2023 at 2:58 PM Fa

Classpath isolation per SparkSession without Spark Connect

2023-11-27 Thread Faiz Halde
Hello, We are using Spark 3.5.0 and were wondering if the following is achievable using spark-core. Our use case involves spinning up a Spark cluster where the driver application loads user jars containing Spark transformations at runtime. A single Spark application can load multiple user jars (

[Spark Core]: Recomputation cost of a job due to executor failures

2023-10-04 Thread Faiz Halde
Hello, Due to the way Spark implements shuffle, the loss of an executor sometimes results in the recomputation of the partitions that were lost. The definition of a *partition* is the tuple (RDD-ids, partition id), where RDD-ids is a sequence of RDD ids. In our system, we define the unit of work performed

[spark-core] Can executors recover/reuse shuffle files upon failure?

2023-05-15 Thread Faiz Halde
Hello, We've been in touch with a few Spark specialists who suggested a potential solution to improve the reliability of our shuffle-heavy jobs. Here is what our setup looks like: - Spark version: 3.3.1 - Java version: 1.8 - We do not use external shuffle service - We use

[SparkListener] Calculating the total amount of re-computations / waste

2022-10-14 Thread Faiz Halde
Hello, We run our Spark workloads on spot and we would like to quantify the impact of spot interruptions on our workloads. We are proposing the following metric but would like your opinions on it. We are leveraging Spark's Event Listener and performing the following: T = task T1 =