Hi,

The reason I am looking to do it differently is that the latency and batch processing times are bad, about 40 seconds. I took the times from the Streaming UI.
As you suggested, I tried the window operation as below, and the times are still bad:

    val dStream = KafkaUtils.createStream(ssc, zkQuorum, group, topicpMap)
    val eventData = dStream.map(_._2)
      .map(_.split(","))
      .map(data => Data(data(0), data(1), data(2), data(3), data(4)))
      .window(Minutes(15), Seconds(3))
    val result = eventData.transform((rdd, time) => {
      rdd.registerAsTable("data")
      sql("SELECT count(state) FROM data WHERE state='Active'")
    })
    result.print()

Any suggestions?

Ali

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-reduceByWindow-reduceFunc-invReduceFunc-windowDuration-slideDuration-tp11591p11612.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
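[Editor's note] Since the query above is just a count, the reduceByWindow variant with an inverse reduce function from this thread's subject may help: each 3-second slide then only adds the batch entering the 15-minute window and subtracts the batch leaving it, rather than rescanning all 15 minutes of data. The following is a minimal plain-Scala sketch of that incremental mechanics only (no Spark dependency; `slidingSums` and the example batch counts are made up for illustration, not part of the original post):

```scala
// Sketch of what reduceByWindow(reduceFunc, invReduceFunc, windowDuration,
// slideDuration) does internally for an associative count: keep a running
// total, apply reduceFunc to the batch entering the window and
// invReduceFunc to the batch leaving it.
object WindowSketch {
  // batchCounts(i) = per-batch count of "Active" events; windowLen = number
  // of batches covered by the window. Returns the windowed total after
  // each slide.
  def slidingSums(batchCounts: Seq[Long], windowLen: Int): Seq[Long] = {
    var total = 0L
    batchCounts.zipWithIndex.map { case (entering, i) =>
      total += entering                      // reduceFunc: add new batch
      if (i >= windowLen)
        total -= batchCounts(i - windowLen)  // invReduceFunc: drop old batch
      total
    }
  }
}
```

For example, `WindowSketch.slidingSums(Seq(5L, 3L, 4L, 2L, 6L), 3)` yields `Seq(5, 8, 12, 9, 12)`: each step costs one addition and at most one subtraction, independent of the window length, which is the same saving the inverse-reduce form of reduceByWindow aims for.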