Hello, I am new to Spark and Structured Streaming and have the following File Output Sink question:
I am wondering what triggers a Spark Structured Streaming query (with a Parquet file output sink configured) to write data to the Parquet files, and how to modify that behavior. I periodically feed the stream input data (using a stream reader that reads in files), but it does not write output to a Parquet file for each file provided as input. Once I have given it a few files, it tends to write a Parquet file just fine. I am wondering how to control the threshold at which it writes. I would like to be able to force a new write to the Parquet sink for every new file provided as input (at least for initial testing).
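Here is a minimal sketch of roughly what my job looks like (the directory paths, the CSV input format, the toy schema, and the 10-second trigger interval are placeholders, not my actual values):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger
import org.apache.spark.sql.types.{LongType, StringType, StructType}

object ParquetSinkTest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("ParquetSinkTest").getOrCreate()

    // File sources need an explicit schema; this one is just a placeholder.
    val schema = new StructType()
      .add("id", LongType)
      .add("value", StringType)

    val input = spark.readStream
      .schema(schema)
      .option("maxFilesPerTrigger", "1")  // pick up at most one new file per micro-batch
      .csv("/tmp/stream-input")           // placeholder input directory

    val query = input.writeStream
      .format("parquet")
      .option("checkpointLocation", "/tmp/stream-checkpoint") // required by the file sink
      .trigger(Trigger.ProcessingTime("10 seconds"))          // poll for new files every 10s
      .start("/tmp/stream-output")                            // placeholder output directory

    query.awaitTermination()
  }
}
```

Any tips appreciated! Thanks, Matt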