Unfortunately this is expensive to do on HDFS, because you'd need a single writer to write the whole file. If your dataset is small enough for that, you can call coalesce(1) on the RDD to bring all the data to one node, and then save it. However, most HDFS applications work with directories containing multiple files instead of single files for this reason.
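A minimal sketch of the coalesce approach (the input and output paths are hypothetical, and this assumes a running Spark context against HDFS):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SingleFileSave {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("single-file-save"))

    val data = sc.textFile("hdfs:///input/dir") // hypothetical input path

    // coalesce(1) funnels all partitions through a single task, so the
    // output directory contains exactly one part file. This only works
    // if the whole dataset fits on one node.
    data.coalesce(1).saveAsTextFile("hdfs:///output/single") // hypothetical output path

    sc.stop()
  }
}
```

Note that the result is still a directory containing a single part file (e.g. part-00000), not a bare file. If you need a true standalone file, you can also write the partitioned output normally and merge it afterwards with `hdfs dfs -getmerge <dir> <localfile>`.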
Matei

On Jan 6, 2014, at 10:56 PM, Nan Zhu <[email protected]> wrote:

> Hi, all
>
> maybe a stupid question, but is there any way to make Spark write a single
> file instead of partitioned files?
>
> Best,
>
> --
> Nan Zhu
