Ah, I see my knowledge is now out of date -- http://wiki.apache.org/lucene-hadoop/HowToConfigure
-Michael

On 11/30/07 10:30 AM, "Michael Bieniosek" <[EMAIL PROTECTED]> wrote:

The value in hadoop-site.xml overrides the value set programmatically. You can set a value for map tasks/reduce tasks in mapred-default.xml instead of hadoop-site.xml -- that value will serve as a default that can be overridden programmatically. However, mapred-default.xml is due to be eliminated in 0.16, and I am not sure what the recommended way is now.

-Michael

On 11/30/07 12:00 AM, "Jason Venner" <[EMAIL PROTECTED]> wrote:

We have several 8-processor machines in our cluster, and for most of our mapper tasks we would like to spawn 8 per machine. We have 1 mapper task that is extremely resource intensive, so we can only spawn 1 of it. We do have multiple arms for our DFS, so we would also like to run multiple reduce jobs on each machine.

We have had little luck changing these parameters by setting the numbers via JobConf: jobConf.setNumMapTasks(int n) and jobConf.setNumReduceTasks(int n). What we have ended up doing is reconfiguring the cluster by changing hadoop-site.xml between the different runs, which is awkward.

Have we just fumble-fingered it, or is there a way we are missing to set the concurrency for mappers and reducers on a per-job basis?
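For reference, a minimal sketch of the per-job approach discussed above, using the old org.apache.hadoop.mapred.JobConf API. The class name, method name, and job name here are illustrative only, and whether the calls take effect depends on the precedence Michael describes: a value fixed in the cluster's hadoop-site.xml wins over the programmatic setting, while a value in mapred-default.xml acts only as a default.

    import org.apache.hadoop.mapred.JobConf;

    // Sketch: setting map/reduce task counts per job, as in the question above.
    // These are requests to the framework; a value forced in hadoop-site.xml on
    // the cluster takes precedence, while a default in mapred-default.xml can be
    // overridden here.
    public class TaskCountExample {            // hypothetical class name
        public static JobConf configure(int numMaps, int numReduces) {
            JobConf jobConf = new JobConf(TaskCountExample.class);
            jobConf.setJobName("per-job-task-count-example"); // illustrative name
            jobConf.setNumMapTasks(numMaps);        // requested number of map tasks
            jobConf.setNumReduceTasks(numReduces);  // number of reduce tasks for this job
            return jobConf;
        }
    }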
