Use counts.repartition(1).saveAsTextFile(...)
Hth

On Oct 20, 2017 3:01 PM, "Uğur Sopaoğlu" <usopao...@gmail.com> wrote:
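A minimal sketch of that suggestion, using the word-count pipeline from the message below; the input file and HDFS output path are the placeholders from this thread, not verified values:

```scala
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("wordcount"))

val counts = sc.textFile("Sample.txt")
  .flatMap(line => line.split(" "))
  .map(word => (word, 1))
  .reduceByKey(_ + _)

// repartition(1) shuffles all the data into a single partition, so
// saveAsTextFile writes one part-00000 file instead of several.
// coalesce(1) does the same without a full shuffle and is usually
// cheaper when you are only reducing the partition count.
counts.repartition(1).saveAsTextFile("hdfs://master:8020/user/abc")
```

Note that a single partition means a single task writes the whole output, so this only makes sense when the result is small enough to fit comfortably on one executor.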
Actually, when I run the following code,

    val textFile = sc.textFile("Sample.txt")
    val counts = textFile.flatMap(line => line.split(" "))
                         .map(word => (word, 1))
                         .reduceByKey(_ + _)

it saves the results into more than one partition, like part-00000, part-00001. I want to collect all of them into one file.

2017-10-20 16:43 GMT+03:00 Marco Mistroni <mmistr...@gmail.com>:

> Hi
> Could you just create an rdd/df out of what you want to save and store it
> in hdfs?
> Hth
>
> On Oct 20, 2017 9:44 AM, "Uğur Sopaoğlu" <usopao...@gmail.com> wrote:
>
>> Hi all,
>>
>> In the word count example,
>>
>>     val textFile = sc.textFile("Sample.txt")
>>     val counts = textFile.flatMap(line => line.split(" "))
>>                          .map(word => (word, 1))
>>                          .reduceByKey(_ + _)
>>     counts.saveAsTextFile("hdfs://master:8020/user/abc")
>>
>> I want to write the collection "counts", which is used in the code above,
>> to HDFS, so:
>>
>>     val x = counts.collect()
>>
>> Actually I want to write x to HDFS. But Spark wants an RDD to write
>> something to HDFS.
>>
>> How can I write an Array[(String, Int)] to HDFS?
>>
>> --
>> Uğur
>
-- 
Uğur Sopaoğlu
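For the original question (writing an already-collected Array[(String, Int)] to HDFS), a sketch of Marco's suggestion is to wrap the local array back into an RDD with sc.parallelize and save that; the output path is the placeholder from the thread, and the tab-separated formatting is an illustrative choice:

```scala
// x lives on the driver after collect(); parallelize() distributes it
// back into an RDD so the usual save APIs apply.
val x: Array[(String, Int)] = counts.collect()

sc.parallelize(x.toSeq, numSlices = 1)           // one slice -> one output file
  .map { case (word, n) => s"$word\t$n" }        // format each pair as one text line
  .saveAsTextFile("hdfs://master:8020/user/abc")
```

Collecting and re-parallelizing is only worthwhile when the data must pass through the driver anyway; otherwise saving the RDD directly (with coalesce(1) if a single file is needed) avoids the round trip.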