On Oct 26, 2008, at 8:38 AM, chaitanya krishna wrote:
> I forgot to mention that although the number of map tasks is set in
> the code as I mentioned before, the actual number of map tasks is not
> necessarily the same, but is very close to it.
The number of reduces is exactly the number configured for the job. The
number of maps depends on the InputFormat selected. For
FileInputFormats, which include TextInputFormat and
SequenceFileInputFormat, the formula is complicated, but it usually
works out to the greater of the number requested and the number of HDFS
blocks in the input.
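
To make the "complicated" part a bit more concrete, here is a rough
sketch of the split arithmetic as I understand it for the old
FileInputFormat. It is written in Python purely as a model, not as
Hadoop code. The function name, the 64 MB default block size, the
minimum split size of 1 byte, and the 1.1 slop factor are assumptions
for illustration, and it ignores non-splittable files, empty files, and
per-file block sizes.

```python
MB = 1024 * 1024

def estimate_map_count(file_sizes, requested_maps,
                       block_size=64 * MB, min_split_size=1,
                       split_slop=1.1):
    """Estimate the number of input splits (and so map tasks).

    Hypothetical helper: a simplified model of the split arithmetic,
    not the actual Hadoop implementation.
    """
    total = sum(file_sizes)
    # The requested number of maps only sets a target split size.
    goal_size = total // max(requested_maps, 1)
    # A split is never larger than a block, never smaller than the minimum.
    split_size = max(min_split_size, min(goal_size, block_size))

    splits = 0
    for size in file_sizes:
        remaining = size
        # Carve off full splits while clearly more than one split remains.
        while remaining / split_size > split_slop:
            splits += 1
            remaining -= split_size
        # Whatever is left over becomes the last split of this file.
        if remaining > 0:
            splits += 1
    return splits

# One 640 MB file (10 blocks): asking for 2 maps still gives one per block.
print(estimate_map_count([640 * MB], requested_maps=2))
# Asking for 20 maps halves the split size and gives 20.
print(estimate_map_count([640 * MB], requested_maps=20))
# Three 100 MB files with 7 maps requested: splits are cut per file,
# so the result is close to, but not equal to, the request.
print(estimate_map_count([100 * MB] * 3, requested_maps=7))
```

Under these assumptions the three calls give 10, 20, and 9. The last
case is the kind of "very close but not the same" result described
above: the requested count is only a hint that sets the target split
size, and each file is then split on its own.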
-- Owen