You probably have determined by now that there is a parameter that determines how many concurrent maps there are.
<property> <name>mapred.tasktracker.tasks.maximum</name> <value>3</value> <description>The maximum number of tasks that will be run simultaneously by a task tracker. </description> </property> Btw... I am still curious about your approach. Isn't it normally better to measure marginal costs such as this startup cost by linear regression as you change parameters? It seems that otherwise, you will likely be mislead by what happens at the boundaries when what you really want it what happens in the normal operating region. On 10/22/07 5:53 PM, "Lance Amundsen" <[EMAIL PROTECTED]> wrote: > ... > > Next I want to increase the concurrent # of tasks being executed for each > node... currently it seems like 2 or 3 is the upper limit (at least on the > earlier binaries I was running). >