Re: Feature to restart Spark job from previous failure point

2023-09-05 Thread Mich Talebzadeh
Hi Dipayan, You ought to maintain data-source consistency and minimise changes upstream. Spark is not a Swiss Army knife :) Anyhow, we already handle this in Spark Structured Streaming with the concept of checkpointing. You can do so by implementing - Checkpointing - Stateful processing in
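The reply breaks off before the full list, but the checkpointing Mich refers to is enabled in Structured Streaming by pointing the sink at a checkpoint directory. A minimal sketch follows; the paths and the rate source are illustrative assumptions, not part of the thread (requires a working PySpark installation):

```python
# Hedged sketch: enabling checkpointing in Spark Structured Streaming.
# On restart, Spark replays from the offsets recorded in the checkpoint
# directory instead of reprocessing the stream from scratch.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("checkpoint-demo").getOrCreate()

events = (spark.readStream
          .format("rate")              # built-in test source; stand-in for Kafka etc.
          .option("rowsPerSecond", 5)
          .load())

query = (events.writeStream
         .format("parquet")
         .option("path", "/tmp/out")                      # illustrative path
         .option("checkpointLocation", "/tmp/ckpt/demo")  # illustrative path
         .start())
```

If the job fails and is resubmitted with the same checkpointLocation, the query resumes from the last committed offsets rather than the beginning of the source.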

Feature to restart Spark job from previous failure point

2023-09-04 Thread Dipayan Dev
Hi Team, One of the biggest pain points we're facing is when Spark reads upstream partition data and, during an action, the upstream also gets refreshed, so the application fails with a 'File does not exist' error. It could happen that the job has already spent a reasonable amount of time, and re-running
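For a batch job like the one described, the resume-from-failure idea can be sketched at the application level: record which partitions finished successfully, and skip them on rerun. The following plain-Python sketch is illustrative only (the function and file names are hypothetical, not a Spark API):

```python
# Hedged sketch: resumable partition processing via a progress file.
# A rerun skips partitions recorded as done, so only the remaining
# work (and any refreshed upstream partitions) is reprocessed.
import json
import os

def process_partitions(partitions, ckpt_path, handler):
    """Process partitions in order, persisting progress after each one
    so a rerun resumes after the last successfully completed partition."""
    done = set()
    if os.path.exists(ckpt_path):
        with open(ckpt_path) as f:
            done = set(json.load(f))
    for part in partitions:
        if part in done:
            continue  # finished in an earlier run; skip on restart
        handler(part)  # e.g. read the partition and write results
        done.add(part)
        with open(ckpt_path, "w") as f:
            json.dump(sorted(done), f)
    return done
```

On a second invocation with the same checkpoint file, only the partitions that had not completed are handed to the handler, which is the behaviour the original post is asking Spark to provide natively.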