Hi, I am processing a bunch of HDFS data using the StreamingContext (Spark 1.1.0), which means that all files existing in the directory at start() time are processed in the first batch. When I try to stop this stream processing via `streamingContext.stop(false, false)` (i.e., even with stopGracefully = false), it has no effect: the stop() call blocks and data processing continues. (It would probably stop after the current batch completes, but that takes too long, since all my data is in that batch.)
I am not exactly sure whether this is generally true or only applies to the first batch. I have also observed that stopping the stream processing during the first batch occasionally takes a very long time, even when no data is present at all. Has anyone experienced something similar? Do I have to do something in particular in my processing code (such as checking the state of the StreamingContext) to allow the interruption? It is quite important for me that stopping the stream processing happens rather quickly. Thanks, Tobias
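For reference, my setup looks roughly like this (the HDFS path, app name, and batch interval below are simplified placeholders, not my actual values):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("FileStreamStopExample")
val ssc = new StreamingContext(conf, Seconds(10))

// All files already present in the directory at start() time
// are picked up together in the first batch.
val lines = ssc.textFileStream("hdfs:///path/to/input/dir")
lines.foreachRDD { rdd =>
  // (actual processing logic omitted)
  rdd.foreach(println)
}

ssc.start()

// Later, from another thread:
// stopSparkContext = false, stopGracefully = false -- yet this call
// blocks until the currently running (first) batch has finished.
ssc.stop(false, false)
```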
