Where can I find information about which config parameters can be set as 
per-node properties, and which ones apply to all nodes? For example, I have a 
cluster consisting of two classes of nodes. One class is dual-core nodes with 
4GB of memory, and the other is 16-core nodes with 128GB of memory. It 
certainly makes sense to configure them differently. So the question is, which 
parameters should I pay attention to? As far as I know, at least the following 
ones can probably be set as node-specific:

    mapred.tasktracker.map.tasks.maximum
    mapred.tasktracker.reduce.tasks.maximum
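
To make it concrete, what I have in mind is something like the following in 
the local config file on each of the small nodes. I am assuming the file is 
mapred-site.xml, and the slot counts are only guesses on my part, not tested 
values:

    <!-- dual-core 4GB node; values are illustrative guesses -->
    <property>
      <name>mapred.tasktracker.map.tasks.maximum</name>
      <value>2</value>
    </property>
    <property>
      <name>mapred.tasktracker.reduce.tasks.maximum</name>
      <value>1</value>
    </property>

The 16-core nodes would get the same two properties with larger values.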
    

But is there anything beyond that? How about the following ones: can I set 
them as node-specific parameters?

    mapred.child.java.opts
    tasktracker.http.threads
    dfs.datanode.handler.count
    io.sort.factor
    io.sort.mb
    mapred.inmem.merge.threshold
    mapred.job.reduce.input.buffer.percent


Thanks!

Zhang
