Re: Improve saveAsTextFile performance

2015-12-07 Thread Akhil Das
3b#file-saveasparquet-java-L80 > > Best Regards, > Ram > -- > > Date: Saturday, December 5, 2015 at 7:18 AM > To: Akhil Das > > Cc: user > Subject: Re: Improve saveAsTextFile performance > > >If you are doing a join/groupBy kind of operations then you need t

Re: Improve saveAsTextFile performance

2015-12-05 Thread Ram VISWANADHA
, Ram -- Date: Saturday, December 5, 2015 at 7:18 AM To: Akhil Das <ak...@sigmoidanalytics.com> Cc: user <user@spark.apache.org> Subject: Re: Improve saveAsTextFile performance > If you are doing a join/groupBy kind of operations then you need to make sure the keys are

Re: Improve saveAsTextFile performance

2015-12-05 Thread Ram VISWANADHA
0 8 3.9 MB / 95334 Best Regards, Ram From: Akhil Das <ak...@sigmoidanalytics.com> Date: Saturday, December 5, 2015 at 1:32 AM To: Ram VISWANADHA <ram.viswana...@dailymotion.com> Cc: user <user@spark.apache.org> Subject: Re: Improve saveAsTex

Re: Improve saveAsTextFile performance

2015-12-05 Thread Akhil Das
ost-never-finishes > > Best Regards, > Ram > > From: Sahil Sareen > Date: Wednesday, December 2, 2015 at 10:18 PM > To: Ram VISWANADHA > Cc: Ted Yu , user > Subject: Re: Improve saveAsTextFile performance > > > http://stackoverflow.com/questions/29213404/how-

Re: Improve saveAsTextFile performance

2015-12-04 Thread Ram VISWANADHA
ANADHA <ram.viswana...@dailymotion.com> Cc: Ted Yu <yuzhih...@gmail.com>, user <user@spark.apache.org> Subject: Re: Improve saveAsTextFile performance http://stackoverflow.com/questions/29213404/how-to-split-an-rdd-into-multiple-smaller-rdds-given-a-max-number-of-rows-per

Re: Improve saveAsTextFile performance

2015-12-02 Thread Sahil Sareen
From: Ted Yu > Date: Wednesday, December 2, 2015 at 3:25 PM > To: Ram VISWANADHA > Cc: user > Subject: Re: Improve saveAsTextFile performance > > Have you tried calling coalesce() before saveAsTextFile ? > > Cheers > > On Wed, Dec 2, 2015 at 3:15 PM, Ram V

Re: Improve saveAsTextFile performance

2015-12-02 Thread Ram VISWANADHA
Yes. That did not help. Best Regards, Ram From: Ted Yu <yuzhih...@gmail.com> Date: Wednesday, December 2, 2015 at 3:25 PM To: Ram VISWANADHA <ram.viswana...@dailymotion.com> Cc: user <user@spark.apache.org> Subject: Re: Improve saveAsTextFile performan

Re: Improve saveAsTextFile performance

2015-12-02 Thread Ted Yu
Have you tried calling coalesce() before saveAsTextFile? Cheers On Wed, Dec 2, 2015 at 3:15 PM, Ram VISWANADHA <ram.viswana...@dailymotion.com> wrote: > JavaRDD.saveAsTextFile is taking a long time to succeed. There are 10 tasks, the first 9 complete in a reasonable time but the last task is

Improve saveAsTextFile performance

2015-12-02 Thread Ram VISWANADHA
JavaRDD.saveAsTextFile is taking a long time to succeed. There are 10 tasks; the first 9 complete in a reasonable time, but the last task is taking a long time to complete. The last task contains most of the records, about 90% of the total. Is there any way to paralleli
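
The skew described in the original post (one of 10 tasks holding about 90% of the records) is what hash partitioning tends to produce when most records share a single key, which fits the later advice in the thread about join/groupBy keys. The sketch below is not from the thread and does not use Spark. It is a plain-Java simulation, under stated assumptions, of how a hash partitioner (non-negative hashCode modulo the partition count) distributes a synthetic data set, and of how key salting, a common mitigation that nobody in the thread proposed, spreads a hot key over several partitions. The class SkewDemo, the key names "hot" and "k<i>", and the record and bucket counts are all hypothetical.

```java
import java.util.Arrays;
import java.util.function.IntFunction;

// Hypothetical stand-alone simulation; no Spark dependency.
class SkewDemo {
    // Pick a partition the way a hash partitioner would:
    // non-negative hashCode modulo the number of partitions.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Count how many records land in each partition.
    // keyFor maps a record index to that record's key.
    static int[] partitionSizes(int numRecords, int numPartitions,
                                IntFunction<String> keyFor) {
        int[] sizes = new int[numPartitions];
        for (int i = 0; i < numRecords; i++) {
            sizes[partitionFor(keyFor.apply(i), numPartitions)]++;
        }
        return sizes;
    }

    // Synthetic keys (assumption): the first 90% of records share the
    // key "hot", the remaining 10% each get a distinct key.
    static String plainKey(int i, int numRecords) {
        return i < numRecords / 10 * 9 ? "hot" : "k" + i;
    }

    // Salting: append a small suffix to the hot key so that its records
    // hash to several partitions instead of one.
    static String saltedKey(int i, int numRecords, int saltBuckets) {
        String key = plainKey(i, numRecords);
        return key.equals("hot") ? key + "#" + (i % saltBuckets) : key;
    }

    static int max(int[] sizes) {
        return Arrays.stream(sizes).max().getAsInt();
    }

    public static void main(String[] args) {
        int n = 100_000;
        int parts = 10;
        int[] plain = partitionSizes(n, parts, i -> plainKey(i, n));
        int[] salted = partitionSizes(n, parts, i -> saltedKey(i, n, parts));
        System.out.println("unsalted: " + Arrays.toString(plain));
        System.out.println("salted:   " + Arrays.toString(salted));
    }
}
```

With unsalted keys, every "hot" record lands in the same partition, so one task ends up with at least 90% of the data regardless of how many partitions exist. Salting changes the keys, so a groupBy or join built on it needs a second step that strips the suffix and merges the partial results. If the output does not need to be grouped by key at all, calling JavaRDD.repartition(n) before saveAsTextFile shuffles records across n partitions regardless of key. coalesce() with its default of no shuffle only merges existing partitions and cannot split an oversized one.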