I've been experimenting with my configuration for a couple of days and gained
quite a bit of performance through small optimizations, but it may very well be
something crazy I'm doing that is causing this problem.
To give a little bit of a background, I am in the early stages of a project
that consumes a
Hi,
Please correct me if I'm wrong: in Spark Streaming, the next batch will
not start processing until the previous batch has completed. Is there
any way to start processing the next batch when the previous batch
takes longer to process than the batch interval?
The problem I am facing
So you have come across spark.streaming.concurrentJobs already :)
Yeah, that is an undocumented feature that does allow multiple output
operations to be submitted in parallel. However, it is not made public for
exactly the reasons you realized - the semantics in the case of stateful
operations is
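For reference, the undocumented property mentioned above is set like any other Spark configuration key. A minimal PySpark sketch, assuming Spark is available; the app name, batch interval, and the value of 2 concurrent jobs are illustrative, not from the thread:

```python
from pyspark import SparkConf, SparkContext
from pyspark.streaming import StreamingContext

# spark.streaming.concurrentJobs defaults to 1 (batches run one at a time).
# Raising it allows more than one job (output operation) to run concurrently,
# at the risk of out-of-order or overlapping processing for stateful operations.
conf = (SparkConf()
        .setAppName("concurrent-jobs-sketch")       # illustrative name
        .set("spark.streaming.concurrentJobs", "2"))  # illustrative value

sc = SparkContext(conf=conf)
ssc = StreamingContext(sc, batchDuration=5)  # 5-second batch interval (illustrative)
```

The same key can also be passed on the command line via `spark-submit --conf spark.streaming.concurrentJobs=2`.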