Would mapWithState be an answer for this?
https://databricks.com/blog/2016/02/01/faster-stateful-stream-processing-in-apache-spark-streaming.html
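
To make the idea concrete, here is a rough sketch of the per-message state such a function would need to keep. It is plain Python rather than Spark code (mapWithState itself is a Scala/Java API), and every name in it (REQUIRED, update_unordered, update_ordered, count_completed) is made up for illustration. It assumes events arrive as (message id, state) pairs and that the per-key state can be held in a dictionary, which is roughly the role the keyed state store would play in a streaming job.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

# Hypothetical set of states a message must visit before it is counted.
REQUIRED: Tuple[str, ...] = ("s1", "s2", "s3")


def update_unordered(seen: FrozenSet[str], state: str) -> Tuple[FrozenSet[str], bool]:
    """Order does not matter: remember which required states were seen.

    Returns the new per-message state and True only on the event that
    completes the required set for the first time.
    """
    required = frozenset(REQUIRED)
    if state not in required:
        return seen, False
    was_complete = seen >= required
    new_seen = seen | {state}
    return new_seen, (not was_complete) and new_seen >= required


def update_ordered(progress: int, state: str) -> Tuple[int, bool]:
    """Order matters: progress is how many required states were hit in order.

    Out-of-order or repeated states are ignored rather than resetting progress;
    that is an assumption of this sketch, not something the question specifies.
    """
    if progress < len(REQUIRED) and state == REQUIRED[progress]:
        progress += 1
        return progress, progress == len(REQUIRED)
    return progress, False


def count_completed(events: Iterable[Tuple[str, str]], ordered: bool = False) -> int:
    """Fold a stream of (message_id, state) events into a single counter."""
    per_key: Dict[str, object] = {}  # stands in for the keyed state store
    count = 0
    for msg_id, state in events:
        if ordered:
            new_state, done = update_ordered(per_key.get(msg_id, 0), state)
        else:
            new_state, done = update_unordered(per_key.get(msg_id, frozenset()), state)
        per_key[msg_id] = new_state
        if done:
            count += 1
    return count


if __name__ == "__main__":
    events = [
        ("m1", "s1"), ("m2", "s2"), ("m1", "s2"), ("m2", "s1"),
        ("m1", "s3"), ("m2", "s3"), ("m1", "s3"),
    ]
    print(count_completed(events, ordered=False))
    print(count_completed(events, ordered=True))
```

In the unordered case the state per message is a set of visited states, so with about 100 states it stays small. In the ordered case a single integer per message is enough. Since transitions can take an indefinite amount of time, you would probably also want some timeout or cleanup for messages that never finish, which this sketch leaves out.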

On Thu, Jun 29, 2017 at 11:55 AM, kant kodali <kanth...@gmail.com> wrote:

> Hi All,
>
> Here is a problem, and I am wondering whether Spark Streaming is the right
> tool for it.
>
> I have a stream of messages m1, m2, m3, ... and each of those messages can
> be in one of the states s1, s2, s3, ..., sn (imagine about 100 states). I
> want to compute some metrics over messages that have visited all the states
> from s1 to sn, but these state transitions can take an indefinite amount of
> time. A simple example would be counting all messages that have visited
> states s1, s2, and s3. In other words, the transition function should know
> that, say, message m1 has visited s1 and s2 but not yet s3, and once m1
> visits s3 it should increment the counter by 1.
>
> If it makes things easier, I can assume that a message has to visit s1
> before s2, s2 before s3, and so on, but I would like to know how to do this
> both with and without that ordering.
>
> Thanks!
>
>
