[ https://issues.apache.org/jira/browse/HBASE-1961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780328#action_12780328 ]

Andrew Purtell commented on HBASE-1961:
---------------------------------------

Feedback up on hbase-user@ from Naresh Rapolu:

{quote}
Your scripts are working fine. We restarted everything and tested, and they are working fine. A few issues, though:

- While starting, launch-hbase-cluster gives the following error: error: "fs.epoll.max_user_instance" is an unknown key. It occurs while starting the ZooKeeper instances.
- We needed MapReduce along with HBase. The note on the JIRA page that you only need to add two lines in hbase-ec2-env.sh is insufficient. The following changes need to be made:

1. hbase-ec2-env.sh should write the mapred.job.tracker property into hadoop-site.xml. (Also, shouldn't you have core-site.xml and hdfs-site.xml, as it is hadoop-0.20.1? In fact, because of this there are warning messages all over the place when you use HDFS through the command line.)
2. HADOOP_CLASSPATH in hadoop/conf/hadoop-env.sh needs to be changed in the underlying AMI to include the HBase and ZooKeeper jars and the conf directory. Probably you can modify the public AMI and recreate the bundle, as the paths to these are known a priori.
3. For other users, the following three lines should be added in hbase-ec2-env.sh.

For the master:
"$HADOOP_HOME"/bin/hadoop-daemon.sh start jobtracker
"$HADOOP_HOME"/bin/hadoop-daemon.sh start tasktracker

For slaves:
"$HADOOP_HOME"/bin/hadoop-daemon.sh start tasktracker
{quote}

Incorporate these suggestions.

bq. error: "fs.epoll.max_user_instance" is an unknown key

This is a bit of future proofing. That is not a known sysctl key until kernel 2.6.27, at which point oddly low epoll user descriptor limits go into effect. See http://pero.blogs.aprilmayjune.org/2009/01/22/hadoop-and-linux-kernel-2627-epoll-limits/. At some point there may be a 2.6.27-based AKI. I could /dev/null the message, but then some other, more serious potential problem with sysctl would be hidden.
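One way to silence just this warning without hiding other sysctl problems would be to probe for the key before setting it. A sketch only, assuming the limit is applied somewhere in the init script; the value is illustrative, and note the real key on 2.6.27+ kernels is fs.epoll.max_user_instances:

{noformat}
# Sketch: only raise the epoll limit on kernels that expose the key
# (2.6.27+), so any other sysctl failure is still reported.
# The value 32768 is illustrative, not a tested recommendation.
if [ -e /proc/sys/fs/epoll/max_user_instances ]; then
  sysctl -w fs.epoll.max_user_instances=32768
fi
{noformat}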
bq. Also, shouldn't you have core-site.xml and hdfs-site.xml, as it is hadoop-0.20.1

Yes. What I did for this initial work was adapt the Hadoop EC2 scripts, which target 0.19.
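For items 1 and 2, something along these lines in the remote init script would produce the 0.20-style split configuration and extend the daemon classpath. This is a sketch only: the MASTER_HOST variable, the JobTracker port, and the jar names and versions are placeholders to be checked against the actual scripts and AMI contents:

{noformat}
# Sketch: write mapred.job.tracker into a 0.20-style mapred-site.xml.
# MASTER_HOST and port 8021 are placeholders.
cat > "$HADOOP_HOME"/conf/mapred-site.xml <<EOF
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>$MASTER_HOST:8021</value>
  </property>
</configuration>
EOF

# Sketch: put the HBase and ZooKeeper jars and the conf directory on the
# Hadoop daemon classpath. Jar names depend on what is baked into the AMI.
cat >> "$HADOOP_HOME"/conf/hadoop-env.sh <<EOF
export HADOOP_CLASSPATH=$HBASE_HOME/hbase-0.20.1.jar:$HBASE_HOME/lib/zookeeper-3.2.1.jar:$HBASE_HOME/conf
EOF
{noformat}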
> HBase EC2 scripts
> -----------------
>
>                 Key: HBASE-1961
>                 URL: https://issues.apache.org/jira/browse/HBASE-1961
>             Project: Hadoop HBase
>          Issue Type: New Feature
>         Environment: Amazon AWS EC2
>            Reporter: Andrew Purtell
>            Assignee: Andrew Purtell
>            Priority: Minor
>             Fix For: 0.21.0, 0.20.3
>
>         Attachments: ec2-contrib.tar.gz
>
>
> The attached tarball is a clone of the Hadoop EC2 scripts, modified significantly to start up an HBase storage-only cluster on top of HDFS backed by instance storage.
> Tested with the HBase 0.20 branch but should work with trunk also. Only the AMI create and launch scripts are tested. They will bring up a functioning HBase cluster.
> Do "create-hbase-image c1.xlarge" to create an x86_64 AMI, or "create-hbase-image c1.medium" to create an i386 AMI. Public Hadoop/HBase 0.20.1 AMIs are available:
>     i386: ami-c644a7af
>     x86_64: ami-f244a79b
> launch-hbase-cluster brings up the cluster: first, a small dedicated ZK quorum, specifiable in size, default of 3; then the DFS namenode (formatting on first boot), one datanode, and the HBase master; then a specifiable number of slaves, instances running DFS datanodes and HBase region servers. For example:
> {noformat}
> launch-hbase-cluster testcluster 100 5
> {noformat}
> would bring up a cluster with 100 slaves supported by a 5-node ZK ensemble.
> We must colocate a datanode with the namenode because currently the master won't tolerate a brand new DFS with only the namenode and no datanodes up yet. See HBASE-1960. By default the launch scripts provision ZooKeeper as c1.medium and the HBase master and region servers as c1.xlarge. The result is an HBase cluster supported by a ZooKeeper ensemble. ZK ensembles are not dynamic, but HBase clusters can be grown by simply starting up more slaves, just like Hadoop.
> hbase-ec2-init-remote.sh can be trivially edited to bring up a jobtracker on the master node and task trackers on the slaves.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.