You have to import org.apache.spark.streaming.StreamingContext._ to enable groupByKey operations on DStreams. After that import, you can apply groupByKey on any DStream of key-value pairs (e.g. DStream[(String, Int)]). Within each batch's RDD, the data will be grouped using the first element of the tuple as the key.
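For example, a minimal sketch (the socket source and the "word count" line format are illustrative, not from the original question):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.StreamingContext._  // brings pair operations like groupByKey into scope

val conf = new SparkConf().setAppName("GroupByKeyExample")
val ssc = new StreamingContext(conf, Seconds(10))

// Illustrative source: each line is "word count"
val pairs = ssc.socketTextStream("localhost", 9999)
  .map { line =>
    val Array(w, n) = line.split(" ")
    (w, n.toInt)
  }                                    // DStream[(String, Int)]

// Group all values in each batch by key (the first tuple element)
val grouped = pairs.groupByKey()       // DStream[(String, Iterable[Int])]
grouped.print()

ssc.start()
ssc.awaitTermination()
```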
TD

On Mon, Jul 14, 2014 at 10:59 AM, srinivas <kusamsrini...@gmail.com> wrote:
> hi
> I am new to spark and scala and I am trying to do some aggregations on a
> json file stream using Spark Streaming. I am able to parse the json string
> and it is converted to map(id -> 123, name -> srini, mobile -> 12324214,
> score -> 123, test_type -> math). Now I want to use a GROUPBY function on each
> student map record and do some aggregations on scores. Here is my
> main function:
>
>     val Array(zkQuorum, group, topics, numThreads) = args
>     val sparkConf = new SparkConf().setAppName("KafkaWordCount")
>     val ssc = new StreamingContext(sparkConf, Seconds(10))
>     // ssc.checkpoint("checkpoint")
>
>     val topicpMap = topics.split(",").map((_, numThreads.toInt)).toMap
>     val lines = KafkaUtils.createStream(ssc, zkQuorum, group, topicpMap).map(_._2)
>     val jsonf = lines.map(JSON.parseFull(_))
>       .map(_.get.asInstanceOf[scala.collection.immutable.Map[String, Any]])
>
>     jsonf.print()
>
>     ssc.start()
>     ssc.awaitTermination()
>   }
>
> Can anyone please let me know how to use the groupby function..thanks
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-Json-file-groupby-function-tp9618.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
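Applied to the question above, one possible sketch: key each parsed record by test_type and aggregate the scores. The field names follow the example record in the question; the Double cast is an assumption based on JSON.parseFull returning JSON numbers as Double.

```scala
// Continuing from the question's jsonf: DStream[Map[String, Any]]
// Key each record by test_type and pull out the score.
val scoresByType = jsonf.map { m =>
  (m("test_type").toString, m("score").asInstanceOf[Double])
}                                              // DStream[(String, Double)]

// Option 1: collect all scores per test_type in each batch
val grouped = scoresByType.groupByKey()        // DStream[(String, Iterable[Double])]

// Option 2: aggregate directly, e.g. sum of scores per test_type
// (reduceByKey avoids shuffling the full value list when you only need the aggregate)
val summed = scoresByType.reduceByKey(_ + _)   // DStream[(String, Double)]
summed.print()
```

For simple aggregations like sums or averages, reduceByKey (or combineByKey) is usually preferable to groupByKey, since it combines values on the map side instead of shuffling every value.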