I created a JIRA ticket for my work in both the Spark and 
spark-cassandra-connector JIRAs; I don't know why you cannot see them.
Users can stream from any Cassandra table, just as one can stream from a Kafka 
topic; same principle. 

Helena
@helenaedelson

> On Mar 24, 2015, at 11:29 AM, Anwar Rizal <anriza...@gmail.com> wrote:
> 
> Helena,
> 
> The CassandraInputDStream sounds interesting. I don't find many things in the 
> jira though. Do you have more details on what it tries to achieve?
> 
> Thanks,
> Anwar.
> 
> On Tue, Mar 24, 2015 at 2:39 PM, Helena Edelson <helena.edel...@datastax.com 
> <mailto:helena.edel...@datastax.com>> wrote:
> Streaming _from_ Cassandra, CassandraInputDStream, is coming BTW: 
> https://issues.apache.org/jira/browse/SPARK-6283 
> <https://issues.apache.org/jira/browse/SPARK-6283>
> I am working on it now.
> 
> Helena
> @helenaedelson
> 
>> On Mar 23, 2015, at 5:22 AM, Khanderao Kand Gmail <khanderao.k...@gmail.com 
>> <mailto:khanderao.k...@gmail.com>> wrote:
>> 
>> Akhil 
>> 
>> You are right in your answer to what Mohit wrote. However, what Mohit seems 
>> to be alluding to, but did not state clearly, might be different.
>> 
>> Mohit
>> 
>> You are wrong in saying that streaming "generally" works with HDFS and 
>> Cassandra. Streaming typically works with a streaming or queuing source like 
>> Kafka, Kinesis, Twitter, Flume, ZeroMQ, etc. (though it can also read from 
>> HDFS and S3). The streaming context (via a "receiver" within the streaming 
>> context) gets events/messages/records and forms a time-window-based batch 
>> (an RDD).
>> 
>> So there is a maximum gap of one window interval between when an alert 
>> message becomes available to Spark and when the processing happens. I think 
>> this is what you meant. 
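>> To make that latency bound concrete, here is a tiny stand-alone sketch (the 
>> object and method names are mine, purely for illustration, not any Spark 
>> API): an event that lands just after a batch boundary waits up to one full 
>> batch interval before its micro-batch fires, plus the per-batch processing 
>> time.

```scala
// Hypothetical illustration of the worst-case alert latency described above.
// An event arriving just after a batch boundary sits in the receiver until
// the next micro-batch is formed, so its queueing delay is bounded by the
// batch interval; processing time for that batch is then added on top.
object LatencyBound {
  def worstCaseDelayMs(batchIntervalMs: Long, processingMs: Long): Long =
    batchIntervalMs + processingMs

  def main(args: Array[String]): Unit = {
    // e.g. a 1 s batch interval with ~200 ms of processing per batch
    println(worstCaseDelayMs(1000L, 200L))
  }
}
```

>> So with a one-second batch interval, an alert can surface up to roughly a 
>> second (plus processing time) after the triggering event arrives.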
>> 
>> As per the Spark programming model, the RDD is the right way to deal with 
>> data. If you are fine with a minimum delay of, say, a second (based on the 
>> minimum window that DStreams can support), then what Rohit gave is the right 
>> model. 
>> 
>> Khanderao
>> 
>> On Mar 22, 2015, at 11:39 PM, Akhil Das <ak...@sigmoidanalytics.com 
>> <mailto:ak...@sigmoidanalytics.com>> wrote:
>> 
>>> What do you mean you can't send it directly from the Spark workers? Here's 
>>> a simple approach you could take:
>>> 
>>>     val data = ssc.textFileStream("sigmoid/")
>>>     data.filter(_.contains("ERROR"))
>>>       .foreachRDD(rdd => alert("Errors: " + rdd.count()))
>>> 
>>> And the alert() function could be anything triggering an email or sending 
>>> an SMS alert.
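>>> A minimal sketch of what such an alert() might look like (the object name 
>>> and the plain-println fallback are mine, purely illustrative; a real 
>>> version would call an SMTP client or an SMS gateway here):

```scala
// Hypothetical alert hook along the lines described above. The formatting
// and println are placeholders; swap in javax.mail, an HTTP POST to a
// pager service, etc., for real delivery.
object Alerting {
  def alert(message: String): String = {
    val formatted = s"[ALERT] $message"
    println(formatted) // placeholder side effect
    formatted
  }
}
```

>>> Note that the function passed to foreachRDD runs on the driver (only 
>>> rdd.count() itself is executed on the workers), so a side effect like this 
>>> fires from the driver, which addresses the "can't send from the workers" 
>>> concern.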
>>> 
>>> Thanks
>>> Best Regards
>>> 
>>> On Sun, Mar 22, 2015 at 1:52 AM, Mohit Anchlia <mohitanch...@gmail.com 
>>> <mailto:mohitanch...@gmail.com>> wrote:
>>> Is there a module in Spark Streaming that lets you listen to 
>>> alerts/conditions as they happen in the streaming job? Generally, Spark 
>>> Streaming components execute on a large cluster alongside stores like HDFS 
>>> or Cassandra; however, when it comes to alerting, you generally can't send 
>>> it directly from the Spark workers, which means you need a way to listen 
>>> for the alerts.
>>> 
> 
> 
