val rdd = sc.parallelize(distinct_tweets_op)
rdd.saveAsTextFile("/home/cloudera/bdp/op")
val textFile = sc.textFile("/home/cloudera/bdp/op/part-0")
val counts = textFile.flatMap(line => line.split(" ")).map(word =>
(word, 1)).reduceByKey(_ + _)
counts.saveAsTextFile("/home/cloudera/bdp/wordcount")
I don't want to write to a file; instead, I want to collect the results in an RDD and apply
a filter function on top of a SchemaRDD, is th