I am struggling to control the behavior of the framework.  The first
problem is simple: I want to run many simultaneous mapper tasks on each
node.  I've scoured the forums, done the obvious, and I still typically get
only 2 tasks per node at execution time.  If it is a big job, sometimes I
see 3.  Note that the administrator reports 40 Tasks/Node in the config,
but the most I've ever seen running is 3 (and this with a single input file
of 10,000 records, magically yielding 443 maps).

And "magically" is the next issue.  I want fine-grained control over the
relationship between the input file, the number of input records, and the
number of maps.  For my immediate problem, I want to use a single input
file with a number of records yielding exactly the same number of maps (all
kicked off simultaneously, BTW).  Since I did not get this behavior with
the standard InputFileFormat, I created my own input format class and
record reader, and am now getting the "1 file with n records to n maps"
relationship... but the problem is that I am not even sure why.
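
For concreteness, here is a stripped-down sketch of the relationship I seem
to be getting, in plain Java with no framework classes.  It assumes the
number of maps follows the number of input splits the input format returns,
with one split per newline-terminated record.  The class and method names
(OneSplitPerRecord, splitsFor) are made up for illustration; this is not my
actual input format or record reader.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class OneSplitPerRecord {

    /**
     * Hypothetical helper: returns one {startOffset, length} pair per record,
     * where a record is a run of bytes ending in '\n' (or the end of the data).
     * Assumption: each returned pair would become one split, hence one map.
     */
    static List<long[]> splitsFor(byte[] data) {
        List<long[]> splits = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < data.length; i++) {
            if (data[i] == '\n') {
                // Record covers [start, i], newline included.
                splits.add(new long[] {start, i - start + 1});
                start = i + 1;
            }
        }
        if (start < data.length) {
            // Last record has no trailing newline.
            splits.add(new long[] {start, data.length - start});
        }
        return splits;
    }

    public static void main(String[] args) {
        byte[] data = "rec1\nrec2\nrec3\n".getBytes(StandardCharsets.US_ASCII);
        List<long[]> splits = splitsFor(data);
        // Under the stated assumption, n records give n splits and so n maps.
        System.out.println(splits.size() + " records -> " + splits.size() + " splits");
    }
}
```

If that assumption holds, it would explain the n-to-n behavior, but it says
nothing about how many of those maps actually run at once on a node.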

Any guidance appreciated.

