Why don't you just map the RDD's rows to lines and then call saveAsTextFile()?
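A minimal sketch of that approach (not the original poster's code; `splitter` and `outputpath` stand in for the variables from the snippet below, and `coalesce(1)` is one way to end up with a single part file, assuming the data fits in one task):

```scala
// In Spark, the whole write would look roughly like:
//
//   rdd.map(row => row.mkString(splitter))   // one delimited line per row
//      .coalesce(1)                          // single partition => single part file
//      .saveAsTextFile(outputpath)
//
// The per-row line building is plain Scala and is shown runnable below.
object RowToLine {
  // Join a row's columns with the delimiter; this avoids the
  // trailing-splitter dropRight(1) trick from the original snippet.
  def rowToLine(row: Seq[Any], splitter: String): String =
    row.map(_.toString).mkString(splitter)

  def main(args: Array[String]): Unit = {
    val rows = Seq(Seq(1, "a", 2.5), Seq(2, "b", 3.5))
    rows.map(rowToLine(_, ",")).foreach(println)
    // prints:
    // 1,a,2.5
    // 2,b,3.5
  }
}
```

Note that coalesce(1) funnels all data through one task; if that is too slow, you can also write normal part files and merge them afterwards with `hdfs dfs -getmerge`.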
On 3.2.2015. 11:15, Hafiz Mujadid wrote:
I want to write a whole SchemaRDD to a single file in HDFS, but I am facing the following exception:

org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /test/data/data1.csv (inode 402042): File does not exist. Holder DFSClient_NONMAPREDUCE_-564238432_57 does not have any open files

Here is my code:

rdd.foreachPartition( iterator => {
  val output = new Path( outputpath )
  val fs = FileSystem.get( new Configuration() )
  val writer = new BufferedWriter( new OutputStreamWriter( fs.create( output ) ) )
  val line = new StringBuilder
  iterator.foreach( row => {
    row.foreach( column => {
      line.append( column.toString + splitter )
    } )
    writer.write( line.toString.dropRight( 1 ) )
    writer.newLine()
    line.clear
  } )
  writer.close()
} )

I think the problem is that I am creating a writer for each partition, so multiple writers run in parallel, and this exception appears when they try to write to the same file. When I avoid this approach, I get a "task not serializable" exception instead. Any suggestions for handling this problem?

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/LeaseExpiredException-while-writing-schemardd-to-hdfs-tp21477.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org