collect() will store the entire output in a List in memory. This solution is
acceptable for "Little Data" problems, although if the entire problem fits
in the memory of a single machine there is less motivation to use Spark.
Most problems which benefit from Spark are large enough that even the data
for the final result may not fit on a single machine.
Hey Steve - the way to do this is to use the coalesce() function to
coalesce your RDD into a single partition. Then you can do a saveAsTextFile
and you'll wind up with outputDir/part-00000 containing all the data.
-Ilya Ganelin
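A minimal sketch of the coalesce approach described above (the sample data, app name, and "outputDir" path are hypothetical; a real job would already have the RDD from upstream work):

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class SingleFileOutput {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("single-file-output")
                .setMaster("local[2]");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Hypothetical stand-in for the RDD produced by the real computation,
        // spread across 4 partitions.
        JavaRDD<String> lines = sc.parallelize(Arrays.asList("a", "b", "c"), 4);

        // coalesce(1) funnels every partition into one, so saveAsTextFile
        // writes exactly one part file: outputDir/part-00000.
        lines.coalesce(1).saveAsTextFile("outputDir");

        sc.stop();
    }
}
```

Note that coalescing to one partition means the final stage runs on a single task, so this only makes sense when the output is small.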
On Mon, Oct 20, 2014 at 11:01 PM, jay vyas wrote:
sounds more like a use case for using "collect"... and writing out the file
in your program?
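A sketch of that collect-and-write approach. Only the file writing is shown as runnable code; the collect() call is left as a comment since it needs a live Spark context, and the sample strings and "output.txt" path are placeholders:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class CollectAndWrite {
    // Writes the collected elements to a single local file,
    // one element per line, preserving their order.
    static void writeToFile(List<String> lines, Path out) throws IOException {
        Files.write(out, lines);
    }

    public static void main(String[] args) throws IOException {
        // In the real job: List<String> lines = rdd.collect();
        List<String> lines = Arrays.asList("first", "second", "third");
        writeToFile(lines, Paths.get("output.txt"));
        System.out.println(Files.readAllLines(Paths.get("output.txt")));
        // → [first, second, third]
    }
}
```

Since collect() pulls everything to the driver, this works only while the output fits in driver memory, which is the limitation Steve raises below.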
On Mon, Oct 20, 2014 at 6:53 PM, Steve Lewis wrote:
Sorry I missed the discussion - although it did not answer the question -
In my case (and I suspect the askers) the 100 slaves are doing a lot of
useful work but the generated output is small enough to be handled by a
single process.
Many of the large data problems I have worked on process a lot of data but
produce only a small amount of output.
This was covered a few days ago:
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-write-a-RDD-into-One-Local-Existing-File-td16720.html
Writing multiple output files is actually essential for parallelism, and
certainly not a bad idea. You don't want 100 distributed workers all
writing to 1 file.
At the end of a set of computations I have a JavaRDD<String>. I want a
single file where each string is printed in order. The data is small enough
that it is acceptable to handle the printout on a single processor. It may
be large enough that using collect to generate a list might be unacceptable.