Re: Interesting Stateful Streaming question
This does sound like a good use case for mapWithState. Note that Spark 2.2 adds a similar [flat]MapGroupsWithState operation to Structured Streaming. Stay tuned for a blog post on that!

On Thu, Jun 29, 2017 at 6:11 PM, kant kodali wrote:
> Is mapWithState an answer for this?
> https://databricks.com/blog/2016/02/01/faster-stateful-stream-processing-in-apache-spark-streaming.html
Re: Interesting Stateful Streaming question
Is mapWithState an answer for this?
https://databricks.com/blog/2016/02/01/faster-stateful-stream-processing-in-apache-spark-streaming.html

On Thu, Jun 29, 2017 at 11:55 AM, kant kodali wrote:
> Here is a problem and I am wondering if Spark Streaming is the right tool
> for this?
Interesting Stateful Streaming question
Hi All,

Here is a problem, and I am wondering whether Spark Streaming is the right tool for it.

I have a stream of messages m1, m2, m3, ... and each of those messages can be in one of the states s1, s2, s3, ..., sn (you can imagine the number of states is about 100). I want to compute some metrics over the messages that have visited all the states from s1 to sn, but these state transitions can take an indefinite amount of time. A simple example would be to count all messages that have visited states s1, s2 and s3. In other words, the transition function should know that, say, message m1 has visited states s1 and s2 but not s3 yet, and once m1 visits s3 it should increment the counter by 1.

If it makes anything easier, I can say that a message has to visit s1 before visiting s2, s2 before visiting s3, and so on, but I would like to know how to do this both with and without the ordering constraint.

Thanks!
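To make the question concrete, here is a minimal sketch of the transition function in plain Python, with no Spark dependency. It is only an illustration of the bookkeeping described above, not Spark's API: the names `REQUIRED`, `Progress`, `update_unordered`, `update_ordered` and `count_completed` are hypothetical, and the dictionary stands in for whatever keyed state store the streaming engine provides. The ordered variant assumes that a state arriving out of order is simply ignored.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Set, Tuple

# Hypothetical target: the states a message must visit before it is counted.
REQUIRED = ("s1", "s2", "s3")


@dataclass
class Progress:
    """Per-message state kept between events (illustrative, not a Spark type)."""
    seen: Set[str] = field(default_factory=set)  # used by the unordered variant
    next_index: int = 0                          # used by the ordered variant
    done: bool = False                           # True once the message was counted


def update_unordered(progress: Progress, state: str) -> bool:
    """Record `state`; return True exactly once, when all REQUIRED states are seen."""
    if progress.done:
        return False
    if state in REQUIRED:
        progress.seen.add(state)
    if len(progress.seen) == len(REQUIRED):
        progress.done = True
        return True
    return False


def update_ordered(progress: Progress, state: str) -> bool:
    """Advance only if `state` is the next expected one; True once on completion."""
    if progress.done:
        return False
    if state == REQUIRED[progress.next_index]:
        progress.next_index += 1
    if progress.next_index == len(REQUIRED):
        progress.done = True
        return True
    return False


def count_completed(
    events: Iterable[Tuple[str, str]],
    update: Callable[[Progress, str], bool],
) -> int:
    """Fold (message_id, state) events into a count of completed messages."""
    store: Dict[str, Progress] = {}  # stands in for the engine's keyed state
    counter = 0
    for message_id, state in events:
        progress = store.setdefault(message_id, Progress())
        if update(progress, state):
            counter += 1
    return counter
```

For example, with the events (m1, s1), (m2, s2), (m1, s2), (m2, s1), (m1, s3), (m2, s3), the unordered variant counts both messages, while the ordered variant counts only m1, because m2 saw s2 before s1. Since transitions can take an indefinite amount of time, a real job would also need some way to expire entries for messages that never complete.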