Is your batch size 30 seconds by any chance?

Assuming not, please check whether you are creating the streaming context
with master "local[n]" where n >= 2, i.e. more threads than receivers. With
"local" or "local[1]", the system has only one processing slot, which is
occupied by the receiver, leaving no room to process the received data. It
could be that after 30 seconds the server disconnects and the receiver
terminates, releasing the single slot so that processing can proceed.
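
To make the slot argument concrete, here is a rough plain-Java analogy. It is
not Spark code: a fixed thread pool stands in for the n slots of "local[n]",
and a task that blocks until released stands in for the receiver. The class
and method names (SlotDemo, processedWhileReceiving) are made up for this
illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Analogy only (assumption): the pool size plays the role of n in "local[n]".
class SlotDemo {

    // Returns how many of `batches` processing tasks completed while the
    // simulated receiver was still holding its slot.
    static int processedWhileReceiving(int slots, int batches) {
        ExecutorService pool = Executors.newFixedThreadPool(slots);
        CountDownLatch receiverStarted = new CountDownLatch(1);
        CountDownLatch receiverDone = new CountDownLatch(1);
        CountDownLatch allProcessed = new CountDownLatch(batches);
        AtomicInteger processed = new AtomicInteger();
        try {
            // The "receiver" occupies one slot until it is told to stop.
            pool.submit(() -> {
                receiverStarted.countDown();
                try {
                    receiverDone.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            receiverStarted.await();

            // Batch-processing tasks are submitted while the receiver runs.
            for (int i = 0; i < batches; i++) {
                pool.submit(() -> {
                    processed.incrementAndGet();
                    allProcessed.countDown();
                });
            }

            // Wait briefly; the tasks finish only if a slot is free.
            allProcessed.await(500, TimeUnit.MILLISECONDS);
            return processed.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            // "Server disconnects": the receiver ends and frees its slot,
            // so any queued tasks can finally run.
            receiverDone.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("1 slot:  " + processedWhileReceiving(1, 3));
        System.out.println("2 slots: " + processedWhileReceiving(2, 3));
    }
}
```

With one slot, nothing should be processed until the receiver lets go; with
two slots, the batches run alongside it. In the real application the
equivalent change is just the master string passed when creating the
context, e.g. "local[2]" instead of "local".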

TD


On Tue, Apr 29, 2014 at 2:28 PM, Eduardo Costa Alfaia <
e.costaalf...@unibs.it> wrote:

> Hi TD,
>
> In my tests with Spark Streaming, I'm using a modified JavaNetworkWordCount
> and a program I wrote that sends words to the Spark worker over TCP. I
> verified that after Spark starts, it connects to my source, which actually
> starts sending, but the first word count is reported approximately 30
> seconds after the context is created. So I'm wondering where the 30
> seconds of data already sent by the source is stored. Is this normal
> Spark behaviour? I saw the same behaviour using the shipped
> JavaNetworkWordCount application.
>
> Many thanks.
> --
> Privacy notice: http://www.unibs.it/node/8155
>
