Global Variables in Spark Streaming

2014-09-10 Thread Ravi Sharma
Hi Friends,

I'm using Spark Streaming as a Kafka consumer, and I want to do CEP with
Spark. For that I need to store my sequence of events so that I can
detect patterns.

My question is: how can I save my events in a Java collection temporarily, so
that I can detect patterns using both the *processed (temporarily stored) and
upcoming events?*


Cheers,
Ravi Sharma


Re: Global Variables in Spark Streaming

2014-09-10 Thread Akhil Das
Yes, your understanding is correct. In that case, the easiest option would be
to serialize the object and dump it somewhere in HDFS, so that you will be
able to recreate/update the object from the file.

We have something similar, which you can find in BroadCastServer
https://github.com/sigmoidanalytics/spork/blob/spork-0.9/src/org/apache/pig/backend/hadoop/executionengine/spark/BroadCastServer.java
and BroadCastClient
https://github.com/sigmoidanalytics/spork/blob/spork-0.9/src/org/apache/pig/backend/hadoop/executionengine/spark/BroadCastClient.java
which we use internally to pass/update objects between the master and worker
nodes.
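A minimal sketch of the serialize-and-restore idea. It writes to a local file with plain java.io for brevity; for HDFS you would open the stream through Hadoop's FileSystem API instead. The class and event names below are illustrative, not from the thread:

```java
import java.io.*;
import java.util.ArrayList;

public class EventStore {

    // Serialize the current event buffer so it can be restored later.
    static void save(ArrayList<String> events, File file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(events);
        }
    }

    // Recreate the event buffer from the file.
    @SuppressWarnings("unchecked")
    static ArrayList<String> load(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            return (ArrayList<String>) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ArrayList<String> events = new ArrayList<>();
        events.add("login");
        events.add("purchase");
        File f = File.createTempFile("events", ".ser");
        save(events, f);
        ArrayList<String> restored = load(f);
        System.out.println(restored);  // [login, purchase]
        f.delete();
    }
}
```

Any Serializable object works the same way, so the pattern-detection state can be dumped at the end of a batch and reloaded on the next one.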

Thanks
Best Regards

On Wed, Sep 10, 2014 at 7:50 PM, Ravi Sharma raviprincesha...@gmail.com
wrote:

 Akhil, by using a broadcast variable, will I be able to change its value?
 As per my understanding, it creates a final (read-only) variable whose value
 is accessible across the cluster.

 Please correct me if I'm wrong.

 Thanks,

 Cheers,
 Ravi Sharma

 On Wed, Sep 10, 2014 at 7:31 PM, Akhil Das ak...@sigmoidanalytics.com
 wrote:

 Have a look at broadcast variables
 http://spark.apache.org/docs/latest/programming-guide.html#broadcast-variables


 Thanks
 Best Regards

 On Wed, Sep 10, 2014 at 7:25 PM, Ravi Sharma raviprincesha...@gmail.com
 wrote:

 Hi Friends,

 I'm using Spark Streaming as a Kafka consumer, and I want to do CEP with
 Spark. For that I need to store my sequence of events so that I can detect
 patterns.

 My question is: how can I save my events in a Java collection temporarily,
 so that I can detect patterns using both the *processed (temporarily stored)
 and upcoming events?*


 Cheers,
 Ravi Sharma






Re: Global Variables in Spark Streaming

2014-09-10 Thread Santiago Mola
Hi Ravi,

2014-09-10 15:55 GMT+02:00 Ravi Sharma raviprincesha...@gmail.com:


 I'm using Spark Streaming as a Kafka consumer, and I want to do CEP with
Spark. For that I need to store my sequence of events so that I can
detect patterns.


Depending on what you're trying to accomplish, you might be able to implement
this with Spark Streaming alone, using the updateStateByKey transformation.
[1]

This will allow you to maintain per-key state that you can combine with
other streaming operations. We have successfully used this approach to
detect patterns in log sequences with Spark Streaming.

[1]
http://spark.apache.org/docs/latest/streaming-programming-guide.html#transformations
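The core of this approach is an update function of the shape updateStateByKey expects: (new values for a key, previous state) → new state. The sketch below keeps that logic as plain Java with java.util.Optional so it stands alone; in real Spark Streaming code you would pass an equivalent Function2 to JavaPairDStream.updateStateByKey, keyed e.g. by user or session ID. The event names and the example pattern are made up:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class PatternState {
    static final int MAX_EVENTS = 100;  // cap the buffer so state stays bounded

    // Merge this batch's events into the previously stored sequence for a key.
    static Optional<List<String>> update(List<String> newEvents,
                                         Optional<List<String>> prevState) {
        List<String> seq = new ArrayList<>(prevState.orElse(new ArrayList<>()));
        seq.addAll(newEvents);
        // Keep only the most recent events.
        if (seq.size() > MAX_EVENTS) {
            seq = new ArrayList<>(seq.subList(seq.size() - MAX_EVENTS, seq.size()));
        }
        return Optional.of(seq);
    }

    // Example pattern check on the stored sequence: "login" followed by "purchase".
    static boolean matches(List<String> seq) {
        int login = seq.indexOf("login");
        return login >= 0 && seq.subList(login, seq.size()).contains("purchase");
    }

    public static void main(String[] args) {
        Optional<List<String>> state = Optional.empty();
        state = update(Arrays.asList("login"), state);              // batch 1
        state = update(Arrays.asList("browse", "purchase"), state); // batch 2
        System.out.println(matches(state.get()));  // true
    }
}
```

Because the state survives from batch to batch, the pattern check sees events that arrived in earlier micro-batches as well as the current one, which is exactly what the original question asks for.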

Best,
-- 
Santiago M. Mola
sm...@stratio.com