Hi,

The reason I am looking to do this differently is that the latency and
batch processing times are bad: about 40 seconds. I took the times from
the Streaming UI.

As you suggested, I tried adding a window as below, and the times are
still bad:

    val dStream = KafkaUtils.createStream(ssc, zkQuorum, group, topicpMap)
    val eventData = dStream.map(_._2)
      .map(_.split(","))
      .map(data => Data(data(0), data(1), data(2), data(3), data(4)))
      .window(Minutes(15), Seconds(3))

    val result = eventData.transform((rdd, time) => {
      rdd.registerAsTable("data")
      sql("SELECT count(state) FROM data WHERE state='Active'")
    })
    result.print()
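
For comparison, the incremental reduceByWindow variant from the thread
subject would look roughly like this. This is only a sketch: I'm assuming
the state value is the third comma-separated field (adjust the index to
match Data above), and the inverse-reduce path requires checkpointing to
be enabled on the context:

    // Sketch only: count "Active" events over the same 15-minute window,
    // sliding every 3 seconds, but incrementally. Each batch adds the new
    // slice and subtracts the slice that fell out of the window, instead
    // of recomputing the whole window.
    ssc.checkpoint("checkpoint-dir")  // required for the inverse function

    val activeCount = dStream.map(_._2)
      .map(_.split(","))
      .filter(fields => fields(2) == "Active")  // assumed index of state
      .map(_ => 1L)
      .reduceByWindow(_ + _, _ - _, Minutes(15), Seconds(3))
    activeCount.print()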
      
Any suggestions?

Ali

--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-reduceByWindow-reduceFunc-invReduceFunc-windowDuration-slideDuration-tp11591p11612.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
