Radim,

In this case, it doesn't matter how many mappers you request in your
job configuration. Hadoop runs exactly one mapper per input split.
Since each of your files is smaller than 64MB (assuming you're using
the default HDFS block size), each file is a single split, so you get
only 2 splits and hence 2 mappers. If you really need more mappers,
you need to create smaller input files.
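
For what it's worth, here's a quick, untested sketch of that last
point: read the big seq file back and round-robin its records into N
smaller seq files, so each output file becomes its own split/mapper.
The class name, paths, and chunk count are just placeholders, and it
ignores compression settings:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.util.ReflectionUtils;

public class SplitSeqFile {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path in = new Path(args[0]);            // e.g. input.seq
        int chunks = Integer.parseInt(args[1]); // e.g. 8 output files

        SequenceFile.Reader reader = new SequenceFile.Reader(fs, in, conf);
        Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
        Writable value = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);

        // One writer per output chunk, named input.seq.part0, .part1, ...
        SequenceFile.Writer[] writers = new SequenceFile.Writer[chunks];
        for (int i = 0; i < chunks; i++) {
            writers[i] = SequenceFile.createWriter(fs, conf,
                new Path(in.getParent(), in.getName() + ".part" + i),
                reader.getKeyClass(), reader.getValueClass());
        }

        // Round-robin records across the output files.
        long n = 0;
        while (reader.next(key, value)) {
            writers[(int) (n++ % chunks)].append(key, value);
        }

        reader.close();
        for (SequenceFile.Writer w : writers) w.close();
    }
}

Point your job's input path at the directory holding the .partN files
and you should get one mapper per file (each being well under a block).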

Paragraph 1 under the Map heading on this page explains it as well:
http://wiki.apache.org/hadoop/HadoopMapReduce

Justin

2011/11/9 Radim Kolar <h...@sendmail.cz>:
> I have 2 input seq files, 32MB each. I want to run them on as many mappers
> as possible.
>
> I appended -D mapred.max.split.size=1000000 as a command-line argument to
> the job, but it makes no difference. The job still runs on 2 mappers.
>
> How does split size work? Is max split size used for reading or for writing files?
>
> Does it work like this: set maxsplitsize, write the files, and you get a
> bunch of seq files as output; then you get the same number of mappers as
> input files?
>
