What I mean is partitioning with something like partitionBy(new HashPartitioner(16)). Would that not work?
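
For reference, here is a minimal plain-Java sketch of what hash partitioning into 16 partitions does to the keys. It does not use Spark. The class and method names (HashPartitionSketch, partitionFor, countPerPartition) are hypothetical and only for illustration. The rule it mimics is a non-negative modulo of the key's hashCode, with null keys going to partition 0, which is how Spark's HashPartitioner assigns partitions as far as I know. Note that calling partitionBy on a pair RDD is itself a shuffle, so this only shows where keys would land, not whether it would cure the slow stage.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch (no Spark dependency) of hash partitioning.
// All names here are hypothetical, chosen for illustration only.
final class HashPartitionSketch {

    // Non-negative modulo of the key's hashCode; null keys go to partition 0.
    static int partitionFor(Object key, int numPartitions) {
        if (key == null) {
            return 0;
        }
        int mod = key.hashCode() % numPartitions;
        return mod < 0 ? mod + numPartitions : mod;
    }

    // Counts how many keys land in each partition, to spot skew.
    static Map<Integer, Integer> countPerPartition(Iterable<?> keys, int numPartitions) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (Object key : keys) {
            counts.merge(partitionFor(key, numPartitions), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample keys; real keys would come from the pair RDD.
        List<String> keys = Arrays.asList("a", "b", "a", "q");
        System.out.println(countPerPartition(keys, 16));
    }
}
```

If most keys fall into a few partitions, a handful of tasks would carry most of the data no matter how many partitions are requested.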
On 17 April 2015 at 21:28, Jeetendra Gangele <gangele...@gmail.com> wrote:
> I have given 3000 tasks to mapToPair; now it is using a lot of memory and
> spending its time shuffling. Here are the stats when I run with very
> small data. It is shuffling almost all of the data; I am not sure what is
> happening here. Any idea?
>
> - Total task time across all tasks: 11.0 h
> - Shuffle read: 153.8 MB
> - Shuffle write: 288.0 MB
>
> On 17 April 2015 at 14:32, Jeetendra Gangele <gangele...@gmail.com> wrote:
>
>> mapToPair is running with 32 tasks but is very slow because of a lot of
>> shuffle reads. Attaching a screenshot.
>> Each task has been running for 10 minutes, even though inside the function
>> I am not doing anything costly.