Hello, I am new to Spark and Structured Streaming and have the following File Output Sink question:
I am wondering what triggers a Spark Structured Streaming query (with a Parquet file output sink configured) to write data to the Parquet files, and how to modify that behavior. I periodically feed the stream input data (using a stream reader that reads in files), but it does not write output to a Parquet file for each file provided as input. Once I have given it a few files, it tends to write a Parquet file just fine. I am wondering how to control the threshold at which it writes. I would like to be able to force a new write to the Parquet sink for every new file provided as input (at least for initial testing).
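Here is a minimal sketch of roughly what my job looks like (the directory paths, the CSV input format, the toy schema, and the 10-second trigger interval are placeholders, not my actual values):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger
import org.apache.spark.sql.types.{LongType, StringType, StructType}

object ParquetSinkTest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("ParquetSinkTest").getOrCreate()

    // File sources need an explicit schema; this one is just a placeholder.
    val schema = new StructType()
      .add("id", LongType)
      .add("value", StringType)

    val input = spark.readStream
      .schema(schema)
      .option("maxFilesPerTrigger", "1")  // pick up at most one new file per micro-batch
      .csv("/tmp/stream-input")           // placeholder input directory

    val query = input.writeStream
      .format("parquet")
      .option("checkpointLocation", "/tmp/stream-checkpoint") // required by the file sink
      .trigger(Trigger.ProcessingTime("10 seconds"))          // poll for new files every 10s
      .start("/tmp/stream-output")                            // placeholder output directory

    query.awaitTermination()
  }
}
```

Any tips appreciated! Thanks, Matt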