Hello, I read the documentation about running multiple Mapper tasks, but I can't get multiple Mappers to run. I am running on EC2 with 10 nodes.
Here's what I know:

1) I guess that, by default, the number of Mapper tasks is decided by the DFS block size, but I would like to override that. My file is small, but each line triggers a fairly long-running, complicated calculation, and these calculations should run in parallel.

2) I tried setting the following property in mapred-site.xml (only on the Master), but that doesn't seem to help:

    <property>
      <name>mapred.map.tasks</name>
      <value>10</value>
    </property>

I still see the following messages:

    10/01/18 01:56:34 INFO mapred.JobClient: Launched map tasks=1
    10/01/18 01:56:34 INFO mapred.JobClient: Data-local map tasks=1

(Also, I know for a fact that multiple mappers are not running!)

3) I read somewhere that JobConf has a method called setNumMapTasks, but that class has been deprecated, so I am not using it. Besides, I heard that this method only provides a hint to Hadoop.

So how do I trigger multiple Mapper tasks? Please let me know.

Thanks.
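
To make the guess in point 1 concrete, here is a rough sketch of the arithmetic in plain Java with no Hadoop dependency. It assumes the split size is the DFS block size clamped between a minimum and a maximum, and that one map task is launched per split; that is a simplification, not the actual Hadoop source. The class name, method names, the 64 MB block size, and the 10 KB file size are all hypothetical values chosen for illustration.

```java
// Simplified sketch of split arithmetic. All names and sizes are
// illustrative assumptions, not taken from the Hadoop code base.
final class SplitEstimate {

    // Effective split size: the block size clamped between minSize and maxSize.
    static long splitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Approximate number of splits (and so map tasks) for a single file,
    // computed as a ceiling division of the file size by the split size.
    static long estimateSplits(long fileBytes, long blockSize, long minSize, long maxSize) {
        if (fileBytes <= 0) {
            return 0;
        }
        long size = splitSize(blockSize, minSize, maxSize);
        return fileBytes / size + (fileBytes % size == 0 ? 0 : 1);
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024; // assumed 64 MB DFS block
        long fileBytes = 10L * 1024;        // hypothetical 10 KB input file

        // No cap on split size: the whole file fits in one split.
        System.out.println(estimateSplits(fileBytes, blockSize, 1L, Long.MAX_VALUE)); // prints 1

        // Hypothetical 1 KB cap on split size: the same file yields ten splits.
        System.out.println(estimateSplits(fileBytes, blockSize, 1L, 1024L)); // prints 10
    }
}
```

Under these assumptions, a file far smaller than one block always lands in a single split, which would match the "Launched map tasks=1" message. Whether a real job honors a cap like the one in the second call depends on the Hadoop version and the input format in use.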