This is really a Hadoop-level issue. I am not sure I have ever managed to get M/R to run multiple mappers on less than one block of data, even with a low max split size. The number of reducers, on the other hand, you can control.
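
For reference, a rough sketch of the split arithmetic used by Hadoop's new-API FileInputFormat, where the split size is max(minSize, min(maxSize, blockSize)). The function names and the sizes below are made up for illustration, and the sketch assumes a single non-empty, splittable file. It only shows what the formula would give; it says nothing about whether a particular driver or input format actually honors the max split size setting.

```python
# Sketch of FileInputFormat-style split arithmetic (illustrative names, not Hadoop's API).

SPLIT_SLOP = 1.1  # the final split may be up to 10% larger than split_size


def compute_split_size(block_size, min_size, max_size):
    """Split size is the block size, clamped between min_size and max_size."""
    return max(min_size, min(max_size, block_size))


def count_splits(file_len, block_size, min_size, max_size):
    """Count the splits for one non-empty, splittable file of file_len bytes."""
    split_size = compute_split_size(block_size, min_size, max_size)
    splits = 0
    remaining = file_len
    # Carve off full-size splits while more than SPLIT_SLOP splits' worth remains.
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    # Whatever is left becomes the last split.
    if remaining != 0:
        splits += 1
    return splits


MB = 1024 * 1024

# Hypothetical 40 MB file on a 64 MB block size.
with_low_max = count_splits(40 * MB, 64 * MB, 1, 10 * MB)
with_default_max = count_splits(40 * MB, 64 * MB, 1, 2**63 - 1)
print(with_low_max, with_default_max)
```

On paper, a 10 MB max split size turns the 40 MB file into several splits, while the default leaves it as one. If the job still runs a single mapper, the setting is probably not reaching the input format, and physically splitting the input files is the fallback.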

On Thu, Mar 28, 2013 at 9:04 AM, Sebastian Briesemeister <[email protected]> wrote:
> Thank you.
>
> Splitting the files leads to multiple MR-tasks!
>
> Only changing the MR settings of hadoop did not help. In the future it
> would be nice if the drivers would scale themself and would split the
> data according to the dataset size and the number of available MR-slots.
