Responses inline.
On Wed, Aug 28, 2019 at 8:42 AM Nick Dawes wrote:
> Thank you, TD. A couple of follow-up questions, please.
>
> 1) "It only keeps around the minimal intermediate state data"
>
> How do you define "minimal" here? Is there a configuration property to
> control the time or size of the streaming DataFrame?
>
> 2) I'm not writing anything out to any database or S3. My req
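On the first question: the main knob for bounding the intermediate state of a windowed aggregation is a watermark (`df.withWatermark("eventTime", "10 minutes")` in the real API), which lets Spark drop state for windows that can no longer receive late data. Here is a toy pure-Python simulation of that eviction rule; none of these names are Spark APIs, it only illustrates the mechanism:

```python
WINDOW = 10  # window length in event-time units (illustrative)

def process_batch(state, events, delay, max_event_time):
    """Update per-window counts, then evict windows closed by the watermark."""
    for event_time in events:
        window_start = (event_time // WINDOW) * WINDOW
        state[window_start] = state.get(window_start, 0) + 1
        max_event_time = max(max_event_time, event_time)
    watermark = max_event_time - delay
    # A window whose end is at or before the watermark can receive no more
    # late data, so its state is dropped -- this is what keeps state "minimal".
    for w in [w for w in state if w + WINDOW <= watermark]:
        del state[w]
    return state, max_event_time

state, max_t = {}, 0
state, max_t = process_batch(state, [5, 12], delay=10, max_event_time=max_t)
print(sorted(state))  # [0, 10] -- watermark is 2, nothing evicted yet
state, max_t = process_batch(state, [25], delay=10, max_event_time=max_t)
print(sorted(state))  # [10, 20] -- watermark is 15, window [0, 10) evicted
```

So "minimal" is not a fixed byte budget you configure directly; the watermark delay you choose determines how long per-window state is retained.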
https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#basic-concepts
> *Note that Structured Streaming does not materialize the entire table.* It
> reads the latest available data from the streaming data source, processes
> it incrementally to update the result, and then discards the source data.
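That passage from the guide is the key point: the "unbounded table" is a logical view, and for an aggregation only the running aggregate is kept, not the input rows. A toy pure-Python sketch of that incremental model (not Spark code):

```python
# Toy model of incremental query execution: each micro-batch updates a
# small running aggregate (count + sum per key), after which the input
# rows themselves are no longer needed. Memory use is bounded by the
# number of distinct keys, not by how much data has streamed through.

def update_result(result, batch):
    for key, value in batch:
        count, total = result.get(key, (0, 0))
        result[key] = (count + 1, total + value)
    return result  # the batch rows can now be discarded

result = {}
result = update_result(result, [("a", 1), ("b", 2)])
result = update_result(result, [("a", 3)])
print(result["a"])  # (2, 4): two "a" rows seen so far, summing to 4
```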
I have a quick newbie question.
Spark Structured Streaming creates an unbounded dataframe that keeps
appending rows to it.
So what's the max size of data it can hold? What if the size becomes bigger
than the JVM heap? Will it spill to disk? I'm using S3 as storage. So will it
write temp data on S3 or
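Related to the size concern: the amount of new data pulled into each micro-batch can also be capped at the source, e.g. `maxFilesPerTrigger` on the file source or `maxOffsetsPerTrigger` on the Kafka source. A PySpark-style configuration fragment (not run here; the path and format are hypothetical):

```python
# Cap how much new data each micro-batch reads, so a single trigger
# never has to process the whole backlog at once.
stream = (spark.readStream
          .format("json")
          .option("maxFilesPerTrigger", 10)   # file source: at most 10 new files per batch
          .load("s3a://my-bucket/events/"))   # hypothetical S3 path
```

This limits per-batch input volume; it is separate from the aggregation-state question above, which watermarks address.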