I currently have 3 servers, one of which serves as both master and slave. Each has a different amount of memory available, and each has a different processor type.
I configure the values for mapred.map.tasks, mapred.reduce.tasks, and mapred.tasktracker.tasks.maximum in the mapred-default.xml file, and the value for mapred.child.java.opts in the hadoop-site.xml file. Am I putting the correct values in the correct files?

Since each server has different capabilities, is it best practice to have each server use as much of its capacity as possible, maximizing heap size and number of tasks? Or is it best practice to choose values that are the lowest common denominator across the cluster?

Also, does each slave read its own local mapred configuration to determine the memory to use for each child process? And is the number of tasks taken from the master's configuration, or from the slaves' configurations?

Thanks in advance for any pointers or rules of thumb you can provide.

JohnM

--
john mendenhall
[EMAIL PROTECTED]
surf utopia internet services
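P.S. For concreteness, this is roughly the kind of per-node override I mean (the property names are the ones I listed above; the values here are just placeholders for one of my machines, not recommendations):

```xml
<!-- Sketch of a per-node override file; values are illustrative only -->
<configuration>
  <!-- max concurrent tasks this TaskTracker will run -->
  <property>
    <name>mapred.tasktracker.tasks.maximum</name>
    <value>2</value>
  </property>
  <!-- JVM options for each child task process -->
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx512m</value>
  </property>
</configuration>
```

The idea would be to vary the task maximum and the -Xmx heap size per machine to match its memory and cores, if that is in fact how the slaves read these settings.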
