Hi,
In general, in Spark Streaming one can apply transformations (filter, map, etc.)
or output operations (collect, forEach, etc.) in an event-driven paradigm,
i.e. the action happens only if a message is received.

Is it possible to perform actions every few seconds in a polling-based fashion,
regardless of whether a new message has been received?

In my use case, I am filtering a stream and then performing operations on the
filtered stream. However, I would like to operate on the data in the
stream every few seconds even if no message has been received in the stream.

For example, I have the following stream, where the violationChecking()
function is called only when a micro-batch is finished. Essentially,
this also means that I must receive a message in this stream to do any
processing. Is there any way that I can do the same operation every 10
seconds or so?

    sortedMessages.foreach(
            new Function<JavaRDD<Tuple5<Long, Integer, String, Integer, Integer>>, Void>() {
                @Override
                public Void call(JavaRDD<Tuple5<Long, Integer, String, Integer, Integer>> tuple5JavaRDD) throws Exception {
                    List<Tuple5<Long, Integer, String, Integer, Integer>> list = tuple5JavaRDD.collect();
                    violationChecking(list);
                    return null;
                }
            }
    );
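
To make the kind of polling I mean concrete, here is a minimal sketch in plain Java, independent of Spark and not tested against it. It assumes the latest collected batch can be kept in memory on the driver and re-checked on a timer. PeriodicChecker and its methods are hypothetical names made up for this illustration, not part of any Spark API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Hypothetical helper: remembers the most recent batch and re-runs a check
// on it at a fixed interval, whether or not a new batch has arrived.
public class PeriodicChecker<T> {
    private final AtomicReference<List<T>> latest =
            new AtomicReference<>(Collections.<T>emptyList());
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    // Store a defensive, read-only copy of the newest batch.
    public void update(List<T> batch) {
        latest.set(Collections.unmodifiableList(new ArrayList<>(batch)));
    }

    // Return the batch that the next periodic check would see.
    public List<T> snapshot() {
        return latest.get();
    }

    // Run the check every `period` units, starting after one period.
    public void start(Consumer<List<T>> check, long period, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(() -> {
            try {
                check.accept(snapshot());
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all further runs,
                // so report it and keep the schedule alive.
                e.printStackTrace();
            }
        }, period, period, unit);
    }

    // Stop the timer thread.
    public void stop() {
        scheduler.shutdownNow();
    }
}
```

In the snippet above, the call() body could then pass tuple5JavaRDD.collect() to update(), while violationChecking would be registered once through start(..., 10, TimeUnit.SECONDS). This presumes the collected list is small enough to hold on the driver.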


Thanks,

Nipun
