I agree with what is stated. This matches my understanding, having
tested it.
When working with Spark Structured Streaming, each streaming query runs in
its own separate Spark session to ensure isolation and avoid conflicts
between different queries.
So here I have:
def process_data(self, df
hm.
In your logic here
def process_micro_batch(micro_batch_df, batchId):
    micro_batch_df.createOrReplaceTempView("temp_view")
    df = spark.sql("select * from temp_view")
    return df
Is this function called, and if so, do you check whether micro_batch_df
contains rows -> if len(micro_batch_df
Hi,
A streaming query clones the Spark session - when you create a temp view
from a DataFrame, the temp view is created under the cloned session. You
will need to use micro_batch_df.sparkSession to access the cloned session.
Thanks,
Jungtaek Lim (HeartSaVioR)
On Wed, Jan 31, 2024 at 3:29 PM Karthick