Hi , You can use FileUtil.copemerge API and specify the path to the folder where saveAsTextFile is save the part text file.
Suppose your directory is /a/b/c/ use FileUtil.copeMerge(FileSystem of source, a/b/c, FileSystem of destination, Path to the merged file say (a/b/c.txt), true(to delete the original dir,null)) Thanks. On Fri, Nov 21, 2014 at 11:31 AM, Jishnu Prathap [via Apache Spark User List] <ml-node+s1001560n19449...@n3.nabble.com> wrote: > Hi I am also having similar problem.. any fix suggested.. > > > > *Originally Posted by GaganBM* > > Hi, > > I am trying to persist the DStreams to text files. When I use the inbuilt > API 'saveAsTextFiles' as : > > stream.saveAsTextFiles(resultDirectory) > > this creates a number of subdirectories, for each batch, and within each > sub directory, it creates bunch of text files for each RDD (I assume). > > I am wondering if I can have single text files for each batch. Is there > any API for that ? Or else, a single output file for the entire stream ? > > I tried to manually write from each RDD stream to a text file as : > > stream.foreachRDD(rdd =>{ > rdd.foreach(element => { > fileWriter.write(element) > }) > }) > > where 'fileWriter' simply makes use of a Java BufferedWriter to write > strings to a file. However, this fails with exception : > > DStreamCheckpointData.writeObject used > java.io.BufferedWriter > java.io.NotSerializableException: java.io.BufferedWriter > at > java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1183) > at > java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547) > ..... > > Any help on how to proceed with this ? > > The information contained in this electronic message and any attachments > to this message are intended for the exclusive use of the addressee(s) and > may contain proprietary, confidential or privileged information. If you are > not the intended recipient, you should not disseminate, distribute or copy > this e-mail. Please notify the sender immediately and destroy all copies of > this message and any attachments. > > WARNING: Computer viruses can be transmitted via email. The recipient > should check this email and any attachments for the presence of viruses. > The company accepts no liability for any damage caused by any virus > transmitted by this email. > > www.wipro.com > > > ------------------------------ > If you reply to this email, your message will be added to the discussion > below: > > http://apache-spark-user-list.1001560.n3.nabble.com/Re-Persist-streams-to-text-files-tp19449.html > To start a new topic under Apache Spark User List, email > ml-node+s1001560n1...@n3.nabble.com > To unsubscribe from Apache Spark User List, click here > <http://apache-spark-user-list.1001560.n3.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_code&node=1&code=cHJhbm5veUBzaWdtb2lkYW5hbHl0aWNzLmNvbXwxfC0xNTI2NTg4NjQ2> > . > NAML > <http://apache-spark-user-list.1001560.n3.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml> > -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Re-Persist-streams-to-text-files-tp19449p19457.html Sent from the Apache Spark User List mailing list archive at Nabble.com.