Hi All,

I have the following question.

Imagine there is a DStream of JSON strings coming in and I apply few
different filters in parallel on the same DStream (so these filters are not
applied one after the other). For Example here is the Pseudo code if that
helps

dstream.filter(x -> { check for certain set of keys }) -> filteredStream1

dstream.filter(x -> { check for another set of keys}) -> filteredStream2

but I cannot do dstream.filter(x -> { check for certain set of keys
}).filter(x -> { check for another set of keys})

now I want to be able to count number of elements in  filteredStream1 and
filteredStream2 and combine the result into one message as follows

{"filteredStream1" : 50,  "filteredStream2": 25}

Any easy way to do this such as leveraging rdd.count across streams or
should I use mapToPair and reduceByKey?

Thanks!

Reply via email to