Hi, I am trying to persist DStreams to text files. When I use the built-in API 'saveAsTextFiles' as:

    stream.saveAsTextFiles(resultDirectory)

this creates a subdirectory for each batch, and within each subdirectory it creates a bunch of text files (one per partition of the RDD, I assume). I am wondering if I can have a single text file for each batch instead. Is there an API for that? Or else, a single output file for the entire stream?

I also tried to manually write each RDD in the stream to a text file as:

    stream.foreachRDD(rdd => {
      rdd.foreach(element => {
        fileWriter.write(element)
      })
    })

where 'fileWriter' simply makes use of a Java BufferedWriter to write strings to a file. However, this fails with an exception:

    DStreamCheckpointData.writeObject used java.io.BufferedWriter
    java.io.NotSerializableException: java.io.BufferedWriter
        at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1183)
        at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
        .....

Any help on how to proceed with this?

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Persist-streams-to-text-files-tp2986.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
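P.S. To clarify what I mean by "a single file for each batch", here is a sketch of the kind of thing I was hoping for. I am assuming (untested) that coalescing each batch's RDD down to one partition before saving would leave a single part file in each batch directory; 'resultDirectory' is the variable from my code above, and the "batch-" naming is just something I made up:

    // Sketch (untested assumption): coalesce to one partition so each
    // batch directory contains a single part-00000 file.
    stream.foreachRDD { (rdd, time) =>
      rdd.coalesce(1)
         .saveAsTextFile(resultDirectory + "/batch-" + time.milliseconds)
    }

I have no idea what the performance cost of coalescing every batch to a single partition would be, so if there is a better-supported way, I would prefer that.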
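P.P.S. Regarding the NotSerializableException: if I understand it correctly, the problem is that my 'fileWriter' gets captured in the closure that Spark serializes and ships to the executors. Here is a variant I am considering, where the BufferedWriter is constructed inside foreachPartition so it is created on the executor and never serialized. The "/tmp/stream-output.txt" path is just a placeholder, and I realize this would write a local file on each executor rather than one single file:

    import java.io.{BufferedWriter, FileWriter}

    stream.foreachRDD { rdd =>
      rdd.foreachPartition { elements =>
        // Construct the writer here, on the executor, so it is never part
        // of the serialized closure. Placeholder path, appending.
        val writer = new BufferedWriter(new FileWriter("/tmp/stream-output.txt", true))
        try {
          elements.foreach { element =>
            writer.write(element.toString)
            writer.newLine()
          }
        } finally {
          writer.close()
        }
      }
    }

Does that look like the right direction, or is there a standard pattern for this?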