Re: Autoscaling in Spark

2023-10-10 Thread Mich Talebzadeh
This has been brought up a few times. I will focus on Spark Structured Streaming. Autoscaling does not support Spark Structured Streaming (SSS). Why? Because streaming jobs are typically long-running jobs that need to maintain state across micro-batches. Autoscaling is designed to scale up and down

Autoscaling in Spark

2023-10-10 Thread Kiran Biswal
Hello Experts, Is there any true autoscaling option for Spark? The dynamic autoscaling works only for batch. Are there any guidelines on Spark streaming autoscaling and how it would be tied to cluster-level autoscaling solutions? Thanks

Re: Updating delta file column data

2023-10-10 Thread Mich Talebzadeh
Hi, Since you mentioned that there could be duplicate records with the same unique key in the Delta table, you will need a way to handle these duplicate records. One approach I can suggest is to use a timestamp to determine the latest or most relevant record among duplicates, the so-called