I'm having trouble getting my data reduced evenly across nodes. The job is: map a single 200,000-line text file, emitting <0L, line> for each line -> custom partitioner that returns a static member i++ % numPartitions, in an attempt to spread the lines across as many reducers as possible -> reduce. I end up with only 13 or 18 of my 100 nodes busy.
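
The partitioner is essentially this (a simplified sketch; the class name and the LongWritable/Text key-value types are just for illustration):

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Partitioner;

    // Round-robin partitioner: ignores the key and assigns each record
    // to the next partition in turn, so records should spread evenly
    // across all numPartitions reducers.
    public class RoundRobinPartitioner extends Partitioner<LongWritable, Text> {
        private static int i = 0;

        @Override
        public int getPartition(LongWritable key, Text value, int numPartitions) {
            i = (i + 1) % numPartitions;
            return i;
        }
    }
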
My hope is to have 300 containers across the 100 nodes, each handling ~666 lines. How can I achieve this?
