Re: Executors not utilized properly.

2014-06-17 Thread abhiguruvayya
Can someone help me with this? Any help is appreciated. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Executors-not-utilized-properly-tp7744p7753.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Executors not utilized properly.

2014-06-17 Thread Sean Owen
It sounds like your job has 9 tasks and all are executing simultaneously in parallel. This is as good as it gets, right? Are you asking how to break the work into more tasks, like 120 to match your 10*12 cores? Make your RDD have more partitions. For example, the textFile method can override the
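Sean's point about matching tasks to cores can be sketched with a toy scheduling model (pure Python, illustrative only; the 9-task and 10*12-core figures come from the thread, the function name is made up):

```python
import math

def task_waves(num_tasks, num_executors, cores_per_executor):
    """Toy model of Spark scheduling: tasks run in 'waves' of at most
    one task per available core slot. Returns (number of waves,
    number of slots busy in the first wave)."""
    slots = num_executors * cores_per_executor
    return math.ceil(num_tasks / slots), min(num_tasks, slots)

# 9 tasks on 10 executors * 12 cores: one wave, but only 9 of 120 slots busy.
waves, busy = task_waves(9, 10, 12)     # (1, 9)

# 120 tasks fills every slot in a single wave.
waves, busy = task_waves(120, 10, 12)   # (1, 120)
```

In the real API this would mean asking for more input partitions (e.g. via the optional partition-count argument to textFile) so the task count approaches the total core count.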

Re: Executors not utilized properly.

2014-06-17 Thread abhiguruvayya
I did try creating more partitions by overriding the default number of partitions determined by HDFS splits. Problem is, in this case the program runs forever. I have the same set of inputs for MapReduce and Spark. Where MapReduce takes 2 mins, Spark takes 5 min to complete the job. I

Re: Executors not utilized properly.

2014-06-17 Thread Jey Kottalam
Hi Abhishek, "Where MapReduce is taking 2 mins, Spark is taking 5 min to complete the job." Interesting. Could you tell us more about your program? A code skeleton would certainly be helpful. Thanks! -Jey On Tue, Jun 17, 2014 at 3:21 PM, abhiguruvayya sharath.abhis...@gmail.com wrote: I did

Re: Executors not utilized properly.

2014-06-17 Thread abhiguruvayya
I found the main reason to be that I was using coalesce instead of repartition. coalesce was shrinking the partitioning, so there were too few tasks to keep all of the executors busy. Can you help me in understanding when to use coalesce and when to use repartition? In application

Re: Executors not utilized properly.

2014-06-17 Thread Aaron Davidson
repartition() is actually just an alias of coalesce(), but with the shuffle flag set to true. This shuffle is probably what you're seeing as taking longer, but it is required when you go from a smaller number of partitions to a larger one. When actually decreasing the number of partitions,
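The distinction Aaron describes can be sketched with a toy model (pure Python, not the Spark implementation; partition contents and function names are illustrative). Shuffle-free coalesce only merges whole parent partitions, so it can never increase the partition count, while repartition moves every record through a full shuffle:

```python
from itertools import chain

def coalesce(parts, n):
    """Toy coalesce (shuffle=false): each parent partition is assigned
    wholesale to a child partition, so no individual record crosses
    partition boundaries. Note the count can only shrink."""
    out = [[] for _ in range(min(n, len(parts)))]
    for i, p in enumerate(parts):
        out[i * len(out) // len(parts)].extend(p)
    return out

def repartition(parts, n):
    """Toy repartition (coalesce with shuffle=true): every record is
    redistributed round-robin, which can grow the partition count but
    moves all the data."""
    out = [[] for _ in range(n)]
    for i, rec in enumerate(chain.from_iterable(parts)):
        out[i % n].append(rec)
    return out

# Shrinking 4 partitions to 2: coalesce just glues neighbors together.
print(coalesce([[1, 2], [3, 4], [5, 6], [7, 8]], 2))  # [[1, 2, 3, 4], [5, 6, 7, 8]]

# Growing 2 partitions to 4 requires the shuffle; coalesce alone cannot do it.
print(repartition([[1, 2], [3, 4]], 4))               # [[1], [2], [3], [4]]
```

This is why the original job was starved: coalescing to a small partition count left too few tasks, while repartitioning to more partitions pays the shuffle cost once but lets every executor core participate.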

Re: Executors not utilized properly.

2014-06-17 Thread abhiguruvayya
Perfect!! That makes so much sense to me now. Thanks a ton