I have a use case where I am joining a streaming DataFrame with a static
DataFrame. The static DataFrame is read from a Parquet table (a directory
containing Parquet files). This Parquet data is overwritten by another
process once a day. I am using Structured Streaming for the streaming
DataFrame.
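For concreteness, my setup looks roughly like the sketch below. The paths,
the join key "id", and the Kafka source are placeholders, not my real job:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder.appName("StreamStaticJoin").getOrCreate()

    // Static side: a directory of Parquet files rewritten once a day.
    val staticDf = spark.read.parquet("/data/daily_dimension")

    // Streaming side: any Structured Streaming source; Kafka shown here.
    val streamDf = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "events")
      .load()
      .selectExpr("CAST(value AS STRING) AS id")

    // Stream-static join: the static side's plan is fixed when the
    // streaming query starts.
    val joined = streamDf.join(staticDf, Seq("id"))

    val query = joined.writeStream
      .format("console")
      .start()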

My question is what would happen to my static DataFrame?

   - Would it update itself because of lazy execution, or is there some
     caching behavior that could prevent this?
   - Could the update process make my code crash?
   - Is it possible to force the DataFrame to refresh itself once a day in
     any way? (One workaround I can think of is sketched below.)
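If per-micro-batch re-evaluation of the static side is not guaranteed, the
fallback I can imagine is to stop the streaming query once a day and restart
it with a freshly loaded static DataFrame, roughly like this (paths and the
"rate" source are placeholders, and I am assuming re-reading the Parquet
directory re-lists its files):

    import org.apache.spark.sql.{DataFrame, SparkSession}
    import org.apache.spark.sql.streaming.StreamingQuery

    def startQuery(spark: SparkSession): StreamingQuery = {
      // Re-reading here re-lists the Parquet directory, so a restart
      // should pick up the files written by the daily process.
      val staticDf: DataFrame = spark.read.parquet("/data/daily_dimension")
      val streamDf = spark.readStream.format("rate").load()
        .selectExpr("CAST(value AS STRING) AS id")
      streamDf.join(staticDf, Seq("id"))
        .writeStream
        .format("parquet")
        .option("path", "/data/joined_out")
        // Reusing the checkpoint across restarts avoids reprocessing.
        .option("checkpointLocation", "/data/joined_chk")
        .start()
    }

    val spark = SparkSession.builder.appName("DailyRefresh").getOrCreate()
    var query = startQuery(spark)
    while (true) {
      // Run for roughly one day, then restart with fresh static data.
      query.awaitTermination(24L * 60 * 60 * 1000)
      query.stop()
      query = startQuery(spark)
    }

Restarting seems heavy-handed, though, so I would prefer a lighter mechanism
if one exists.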

I am working with Spark 2.3.2.
