Given the program below, I have the following questions:
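import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}

val sparkConf = new SparkConf().setAppName("NetworkWordCount")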
sparkConf.set("spark.master", "local[2]")
val ssc = new StreamingContext(sparkConf, Seconds(10))
val lines = ssc.socketTextStream(args(0), args(1).toInt, StorageLevel.MEMORY_AND_DISK_SER)
val words = lines.flatMap(_.split(" "))
val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _)
wordCounts.print()
ssc.start()
ssc.awaitTermination()
Q1) How do I know which part of the program executes every 10 seconds?
My requirement is to execute a method that inserts data into Cassandra
every time a set of messages comes in.
Q2) Is there a function I can pass so that it gets executed when the next
set of messages comes in?
Q3) If I call a method in between the following lines:
val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _)
wordCounts.print()
my_method(stream rdd)
ssc.start()
the method does not get executed.
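To make Q2 and Q3 concrete, this is roughly what I am trying to get working. It is only a sketch under assumptions: I am guessing that foreachRDD on the DStream is the right hook for per-batch work, myMethod is a hypothetical stand-in for my_method that prints instead of writing to Cassandra, and the implicit reduceByKey on the DStream assumes a reasonably recent Spark (older releases may need an extra import of StreamingContext._).

```scala
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext, Time}

object PerBatchHook {
  // Hypothetical placeholder for my_method: a real version would insert into Cassandra.
  def myMethod(rdd: RDD[(String, Int)], batchTime: Time): Unit = {
    // collect() pulls the whole batch to the driver, so this is only suitable for small test input.
    rdd.collect().foreach { case (word, count) =>
      println(s"$batchTime: would insert ($word, $count)")
    }
  }

  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setAppName("NetworkWordCount").setMaster("local[2]")
    val ssc = new StreamingContext(sparkConf, Seconds(10))
    val lines = ssc.socketTextStream(args(0), args(1).toInt, StorageLevel.MEMORY_AND_DISK_SER)
    val wordCounts = lines.flatMap(_.split(" ")).map(x => (x, 1)).reduceByKey(_ + _)

    // Registered before start(); the intent is that this function is invoked for each 10-second batch.
    wordCounts.foreachRDD((rdd: RDD[(String, Int)], time: Time) => myMethod(rdd, time))

    ssc.start()
    ssc.awaitTermination()
  }
}
```

The difference from my Q3 attempt is that the method is passed to the stream as a function rather than called directly between the transformation lines and ssc.start().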
Can someone answer these questions?
--Reddy