[ https://issues.apache.org/jira/browse/HBASE-1961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780328#action_12780328 ]

Andrew Purtell commented on HBASE-1961:
---------------------------------------

Feedback up on hbase-user@ from Naresh Rapolu:

{quote}
Your scripts are working fine. We restarted everything and tested, and they are working fine. A few issues, though:

- While starting, launch-hbase-cluster gives the following error: error: "fs.epoll.max_user_instance" is an unknown key. It occurs while starting the ZooKeeper instances.
- We needed MapReduce along with HBase. The note on the JIRA page that you only need to add two lines in hbase-ec2-env.sh is insufficient. The following changes need to be made:

1. hbase-ec2-env.sh should write the mapred.job.tracker property into hadoop-site.xml. (Also, shouldn't you have core-site.xml and hdfs-site.xml, as it is hadoop-0.20.1? In fact, because of this there are warning messages all over the place when you use HDFS through the command line.)
2. HADOOP_CLASSPATH in hadoop/conf/hadoop-env.sh needs to be changed in the underlying AMI to include the HBase and ZooKeeper jars and the conf directory. Probably you can modify the public AMI and recreate the bundle, as the paths to these are known a priori.
3. For other users, the following three lines should be added in hbase-ec2-env.sh.

For the master:
"$HADOOP_HOME"/bin/hadoop-daemon.sh start jobtracker
"$HADOOP_HOME"/bin/hadoop-daemon.sh start tasktracker

For slaves:
"$HADOOP_HOME"/bin/hadoop-daemon.sh start tasktracker
{quote}

Incorporate these suggestions.

bq. error: "fs.epoll.max_user_instance" is an unknown key

This is a bit of future proofing. That is not a known sysctl key until kernel 2.6.27, at which point oddly low epoll user descriptor limits go into effect. See http://pero.blogs.aprilmayjune.org/2009/01/22/hadoop-and-linux-kernel-2627-epoll-limits/. At some point there may be a 2.6.27-based AKI. I could /dev/null the message, but then some other, more serious potential problem with sysctl would be hidden.
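One way to silence just this warning without hiding other sysctl problems would be to probe for the key before setting it. A sketch only, assuming the limit is applied somewhere in the init script; the value is illustrative, and note the real key on 2.6.27+ kernels is fs.epoll.max_user_instances:

{noformat}
# Sketch: only raise the epoll limit on kernels that expose the key
# (2.6.27+), so any other sysctl failure is still reported.
# The value 32768 is illustrative, not a tested recommendation.
if [ -e /proc/sys/fs/epoll/max_user_instances ]; then
  sysctl -w fs.epoll.max_user_instances=32768
fi
{noformat}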
bq. Also, shouldn't you have core-site.xml and hdfs-site.xml, as it is hadoop-0.20.1

Yes. What I did for this initial work was adapt the Hadoop EC2 scripts, which target 0.19.
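For items 1 and 2, something along these lines in the remote init script would produce the 0.20-style split configuration and extend the daemon classpath. This is a sketch only: the MASTER_HOST variable, the JobTracker port, and the jar names and versions are placeholders to be checked against the actual scripts and AMI contents:

{noformat}
# Sketch: write mapred.job.tracker into a 0.20-style mapred-site.xml.
# MASTER_HOST and port 8021 are placeholders.
cat > "$HADOOP_HOME"/conf/mapred-site.xml <<EOF
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>$MASTER_HOST:8021</value>
  </property>
</configuration>
EOF

# Sketch: put the HBase and ZooKeeper jars and the conf directory on the
# Hadoop daemon classpath. Jar names depend on what is baked into the AMI.
cat >> "$HADOOP_HOME"/conf/hadoop-env.sh <<EOF
export HADOOP_CLASSPATH=$HBASE_HOME/hbase-0.20.1.jar:$HBASE_HOME/lib/zookeeper-3.2.1.jar:$HBASE_HOME/conf
EOF
{noformat}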
> HBase EC2 scripts
> -----------------
>
>                 Key: HBASE-1961
>                 URL: https://issues.apache.org/jira/browse/HBASE-1961
>             Project: Hadoop HBase
>          Issue Type: New Feature
>         Environment: Amazon AWS EC2
>            Reporter: Andrew Purtell
>            Assignee: Andrew Purtell
>            Priority: Minor
>             Fix For: 0.21.0, 0.20.3
>
>         Attachments: ec2-contrib.tar.gz
>
>
> The attached tarball is a clone of the Hadoop EC2 scripts, modified significantly to start up an HBase storage-only cluster on top of HDFS backed by instance storage.
> Tested with the HBase 0.20 branch but should work with trunk also. Only the AMI create and launch scripts are tested. They will bring up a functioning HBase cluster.
> Do "create-hbase-image c1.xlarge" to create an x86_64 AMI, or "create-hbase-image c1.medium" to create an i386 AMI. Public Hadoop/HBase 0.20.1 AMIs are available:
>     i386: ami-c644a7af
>     x86_64: ami-f244a79b
> launch-hbase-cluster brings up the cluster: first, a small dedicated ZK quorum, specifiable in size, default of 3; then the DFS namenode (formatting on first boot), one datanode, and the HBase master; then a specifiable number of slaves, instances running DFS datanodes and HBase region servers. For example:
> {noformat}
> launch-hbase-cluster testcluster 100 5
> {noformat}
> would bring up a cluster with 100 slaves supported by a 5-node ZK ensemble.
> We must colocate a datanode with the namenode because currently the master won't tolerate a brand new DFS with only the namenode and no datanodes up yet. See HBASE-1960. By default the launch scripts provision ZooKeeper as c1.medium and the HBase master and region servers as c1.xlarge. The result is an HBase cluster supported by a ZooKeeper ensemble. ZK ensembles are not dynamic, but HBase clusters can be grown by simply starting up more slaves, just like Hadoop.
> hbase-ec2-init-remote.sh can be trivially edited to bring up a jobtracker on the master node and task trackers on the slaves.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.