This is by no means a complete list, just a start:

For HDFS:

- Make sure you run Java 6 (jdk1.6).
- Set the namenode handler count to 40 or more (dfs.namenode.handler.count, and maybe mapred.job.tracker.handler.count etc.).

- More config guides are in the works: https://issues.apache.org/jira/browse/HADOOP-1917
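
As a rough sketch, the handler-count settings above might look like this in the site configuration file. The file name (conf/hadoop-site.xml) and the value of 40 for the job tracker are assumptions on my part; only the 40-or-more figure for the namenode comes from the suggestion above, so treat the rest as a starting point to tune.

```xml
<?xml version="1.0"?>
<!-- Assumed location: conf/hadoop-site.xml (site-specific overrides). -->
<configuration>
  <!-- Namenode RPC handler threads; 40 or more for a large cluster. -->
  <property>
    <name>dfs.namenode.handler.count</name>
    <value>40</value>
  </property>
  <!-- Job tracker RPC handler threads; the value here is an assumption. -->
  <property>
    <name>mapred.job.tracker.handler.count</name>
    <value>40</value>
  </property>
</configuration>
```

The namenode and job tracker would need a restart to pick up the new values.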

Raghu.

Derek Gottfrid wrote:
Are there configuration suggestions for 1k nodes? I was seeing tons
of timeouts trying to run 1k nodes. Are there network settings that I
need to change? The out-of-the-box settings seemed to work up to a couple
hundred nodes, but I want to go bigger. Pointers/Suggestions?

derek


P.S. I wrote up my EC2/Hadoop work at nytimes.com; check it out:
http://open.blogs.nytimes.com/2007/11/01/self-service-prorated-super-computing-fun/
