Hi, I am new to Spark and Scala, and I am trying to do some aggregations on a JSON stream using Spark Streaming. I am able to parse each JSON string, which gives me a map like:

    Map(id -> 123, name -> srini, mobile -> 12324214, score -> 123, test_type -> math)

Now I want to use a groupBy function on each student's map data and do some aggregations on the scores. Here is my main function:

    val Array(zkQuorum, group, topics, numThreads) = args
    val sparkConf = new SparkConf().setAppName("KafkaWordCount")
    val ssc = new StreamingContext(sparkConf, Seconds(10))
    // ssc.checkpoint("checkpoint")
    val topicpMap = topics.split(",").map((_, numThreads.toInt)).toMap
    val lines = KafkaUtils.createStream(ssc, zkQuorum, group, topicpMap).map(_._2)
    val jsonf = lines
      .map(JSON.parseFull(_))
      .map(_.get.asInstanceOf[scala.collection.immutable.Map[String, Any]])
    jsonf.print()
    ssc.start()
    ssc.awaitTermination()

Can anyone please let me know how to use the groupBy function? Thanks.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-Json-file-groupby-function-tp9618.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
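For concreteness, here is a minimal sketch of the kind of grouping and aggregation the question describes, using plain Scala collections rather than a DStream. The field names `score` and `test_type` come from the parsed map shown above; the sample records and the sum aggregation are illustrative assumptions, not part of the original post:

```scala
// Each record has the same shape as the parsed JSON map in the question.
val records: Seq[Map[String, Any]] = Seq(
  Map("id" -> 123, "name" -> "srini", "mobile" -> "12324214", "score" -> 123, "test_type" -> "math"),
  Map("id" -> 124, "name" -> "kiran", "mobile" -> "12324215", "score" -> 101, "test_type" -> "math"),
  Map("id" -> 125, "name" -> "anu",   "mobile" -> "12324216", "score" -> 88,  "test_type" -> "science")
)

// Group the records by test_type, then sum the scores within each group.
val totals: Map[String, Int] = records
  .groupBy(_("test_type").toString)
  .map { case (testType, rows) =>
    testType -> rows.map(_("score").toString.toInt).sum
  }

println(totals) // one total score per test_type
```

On the DStream itself, a common alternative to groupBy is to map each record to a key/value pair, e.g. `jsonf.map(m => (m("test_type").toString, m("score").toString.toDouble))`, and then call `reduceByKey(_ + _)` per batch, which aggregates without materializing whole groups.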