Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The following page has been changed by SteveLoughran:
http://wiki.apache.org/hadoop/LargeClusterTips

The comment on the change is:
More big cluster tips

------------------------------------------------------------------------------
  Below are tips for managing large clusters.

+  * Have a good sysadmin if you're not one yourself.
   * Take a look at a presentation done by Allen Wittenauer from Yahoo!: http://tinyurl.com/5foamm
+  * Have the LAN closed off to untrusted users. This simplifies security.
+  * Use LDAP or similar to manage user accounts.
-  * Only put the slaves file on your namenode and secondary namenode to prevent confusion
+  * Only put the slaves file on your namenode and secondary namenode to prevent confusion.
+  * Have identical hardware on all machines in the cluster, eliminating the need to have different
+  configuration options (task slots, data directory locations, etc)
+  * Use RPMs to install the Hadoop binaries. Self:Cloudera provide some RPMs for this, and a web site to generate configuration RPM files.
+  * Use kickstart or similar to bring up the machines.
-  * Use a system configuration management package to keep Hadoop's source consistent across all nodes. Some example packages are bcfg2, smartfrog, puppet, cfengine, etc.
+  * Consider a system configuration management package to keep Hadoop's source and configuration consistent across all nodes. Some example packages are bcfg2, smartfrog, puppet, cfengine, etc.
-  * Have a good sysadmin if you're not one
+  * If you are trying to configure the machines one by one, step away from the keyboard. That is not the way to manage a cluster.

  See the Self:AmazonEC2 and AmazonS3 pages for tips on managing clusters built on EC2 and S3.

- Other good documentation: http://wiki.smartfrog.org/wiki/display/sf/Patterns+of+Hadoop+Deployment
+ Other good documentation: [http://wiki.smartfrog.org/wiki/display/sf/Patterns+of+Hadoop+Deployment Patterns of Hadoop Deployment]
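
The tip about keeping configuration consistent across all nodes can be spot-checked with a small script. The following is only a rough sketch, not something from the wiki page: it assumes you have already collected each node's configuration file contents by some means (ssh, your configuration management tool, etc.), and the hostnames, sample contents and function names are all hypothetical. It hashes each node's copy and reports the nodes that differ from the most common version.

```python
import hashlib
from collections import Counter


def config_digest(text):
    """Return a SHA-256 hex digest of one node's configuration text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_drifted_nodes(configs):
    """Return the sorted hostnames whose config differs from the majority.

    `configs` maps a hostname to the text of its configuration file.
    How that text is collected is outside the scope of this sketch.
    """
    if not configs:
        return []
    digests = {host: config_digest(text) for host, text in configs.items()}
    # Treat the most common digest as the reference version.
    majority_digest, _count = Counter(digests.values()).most_common(1)[0]
    return sorted(host for host, d in digests.items() if d != majority_digest)


if __name__ == "__main__":
    # Hypothetical hostnames and file contents, for illustration only.
    sample = {
        "node01": "task.slots=4\ndata.dir=/data/1\n",
        "node02": "task.slots=4\ndata.dir=/data/1\n",
        "node03": "task.slots=2\ndata.dir=/data/1\n",
    }
    for host in find_drifted_nodes(sample):
        print("configuration drift on", host)
```

A check like this only detects drift; a configuration management package such as those listed above is still what should correct it.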
