> Problem is - I don't know what HBase configurations to use
> in my MapReduce program to point to HBase on another EC2 machine.

1) Copy hbase-site.xml from the HBase cluster master
   /usr/local/hbase*/conf/hbase-site.xml
and put it on the classpath of your Hadoop cluster. Make sure
hbase-default.xml is also on the classpath of the Hadoop
cluster.
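
A quick way to sanity-check step 1 is to confirm that the copied file names the remote ZooKeeper quorum rather than localhost. The following is only a sketch: it assumes the file from the master contains an hbase.zookeeper.quorum property, the helper name quorum_of is made up for illustration, and the host names in the sample file are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: print the value of hbase.zookeeper.quorum from an
# hbase-site.xml file. Handles <value> on the same line as <name> or on
# the line directly after it.
quorum_of() {
  grep -A1 '<name>hbase.zookeeper.quorum</name>' "$1" \
    | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

# Demo on a throwaway sample file (placeholder host names).
sample=$(mktemp)
cat > "$sample" <<'EOF'
<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zk1.example.internal,zk2.example.internal,zk3.example.internal</value>
  </property>
</configuration>
EOF
quorum_of "$sample"
rm -f "$sample"
```

On the Hadoop cluster you would point quorum_of at the hbase-site.xml you placed on the classpath; if it prints nothing or a local host name, the job will not find the remote cluster.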

2) Make sure your Hadoop cluster instances can communicate with
the HBase ZooKeeper, master, and slave security groups. Typically
this means you have to execute a number of ec2-authorize commands
of the form:

   ec2-authorize <group-1> -o <group-2> -u <account-id>
   ec2-authorize <group-2> -o <group-1> -u <account-id>

where group-1 stands for each of your Hadoop cluster's security
groups in turn, and group-2 for each of your HBase cluster's
security groups, so you run the pair once for every combination.
It's annoying, but you only have to do it once; the changes
persist in your security group ACLs.
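
If it helps, the pairings can be generated with a small loop instead of typed by hand. This is a sketch under assumptions: the Hadoop group names and the account id are placeholders, and the HBase group names are the ones the launch log for "testcluster" shows. It only prints the commands so you can review them first.

```shell
#!/bin/sh
# Placeholder values: substitute your own group names and AWS account id.
HADOOP_GROUPS="hadoop-cluster hadoop-cluster-master"
HBASE_GROUPS="testcluster testcluster-master testcluster-zookeeper"
ACCOUNT_ID="123456789012"

# Print (not run) the two ec2-authorize commands for every
# Hadoop-group / HBase-group combination.
authorize_pairs() {
  for g1 in $HADOOP_GROUPS; do
    for g2 in $HBASE_GROUPS; do
      echo "ec2-authorize $g1 -o $g2 -u $ACCOUNT_ID"
      echo "ec2-authorize $g2 -o $g1 -u $ACCOUNT_ID"
    done
  done
}

authorize_pairs
```

Once the printed list looks right, piping it to sh (authorize_pairs | sh) would execute the commands, assuming the EC2 API tools are installed and configured.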

   - Andy



________________________________
From: Something Something <[email protected]>
To: [email protected]
Sent: Fri, December 11, 2009 9:50:30 AM
Subject: Re: Starting HBase in fully distributed mode...

1)  Yes, I used the same cluster name.  Okay, let me try again tonight, but
in any case, I was able to ssh to Master and confirm setup.
2)  I tried the Hadoop EC2 scripts last night.  I keep getting 'Waiting for
instance to start' and it seems to get stuck there.  I also keep getting
several messages like this...
.Required option '-K, --private-key KEY' missing (-h for usage)

Seems like I haven't set *something* correctly.  Will look into this tonight
as well.

3)  Not sure what you mean here.  Yes, my Hadoop machines will be on EC2 as
well.

Here's my plan for the weekend:

Start Hadoop instances on 10 EC2 machines.
Start HBase on 5 EC2 machines along with Zookeeper on 5 machines.
Start a  MapReduce job on Hadoop (master) instance.

Problem is - I don't know what HBase configurations to use in my MapReduce
program to point to HBase on another EC2 machine.  Makes sense?



On Fri, Dec 11, 2009 at 12:06 AM, Andrew Purtell <[email protected]> wrote:

> > ./bin/hbase-ec2 login testcluster
> >
> > Use this to login.  I tried running this from my local machine, but nothing
> > *noteworthy* happened.
>
> Did you replace "testcluster" with the name you used when launching your
> cluster, assuming they are different? The scripts address clusters by the
> labels you give them when launching them. E.g.
>
>   ./bin/hbase-ec2 launch foo 3 3
>
> launches a cluster named "foo", and
>
>   ./bin/hbase-ec2 login foo
>
> opens a SSH shell on the master of cluster "foo".
>
> > Did you also create similar scripts for Hadoop?
>
> Hadoop has its own set of EC2 scripts. I used those as the basis for ours.
> You can't use the HBase and Hadoop EC2 scripts together however.
>
> > Later I want to start a MapReduce job on my Hadoop machines that will
> > access this HBase cluster.  How would I do that?
>
> Are your Hadoop machines up on EC2 also?
>
> Running mapreduce jobs on the HBase cluster itself is a work in progress.
>
>   - Andy
>
>
>
> ________________________________
> From: Something Something <[email protected]>
> To: [email protected]
> Sent: Thu, December 10, 2009 8:21:10 PM
> Subject: Re: Starting HBase in fully distributed mode...
>
> Andy,
>
> Thanks for the tips.  It's all working now.  I was using a different KeyPair
> for EC2_ROOT_SSH_KEY.  Once I changed this to use the root.pem it started
> working.  I was able to ssh to the 'master' instance and get into hbase
> shell etc.  This script is VERY helpful!  Thank you so much.
>
> A few questions...
>
> 1)  The README.txt file says this..
>
> ./bin/hbase-ec2 login testcluster
>
> Use this to login.  I tried running this from my local machine, but nothing
> *noteworthy* happened.  I wasn't able to get into the hbase shell from my
> local machine.  Anyway, this is not a big deal for me.
>
> 2)  Did you also create similar scripts for Hadoop?  (I guess I will look
> into the trunk!).
>
> 3)  Say I use your script to start HBase on a few machines, and start Hadoop
> on some other machines.  Later I want to start a MapReduce job on my Hadoop
> machines that will access this HBase cluster.  How would I do that?  What
> HBase configurations can I use?  So far my Mapreduce job always accesses
> HBase on the same machine.
>
> Thanks once again for your help.
>
>
>
> On Thu, Dec 10, 2009 at 5:30 PM, Vaibhav Puranik <[email protected]> wrote:
>
> > We have HBase running on EC2 with starting Zookeeper within HBase. We have
> > it up since July 2009. No problems so far on Zookeeper front.
> >
> > Regards,
> > Vaibhav Puranik
> > Gumgum
> >
> > On Thu, Dec 10, 2009 at 8:12 AM, Something Something <[email protected]> wrote:
> >
> > > Finally, I was able to get HBase running on EC2 in fully distributed mode.
> > > I started ZooKeeper quorum myself and pointed HBase to it.  I was able to
> > > create tables using HBase shell, ran a Mapreduce job that writes to these
> > > tables, and run queries against these tables.  I used HBase shell from all 3
> > > machines, and they all see the same data confirming that the instances are
> > > indeed working together.
> > >
> > > It seems like under EC2, starting ZooKeeper within HBase doesn't work, but I
> > > could be wrong.
> > >
> > > In any case, Andrew, I would like to get your scripts working in my
> > > environment because without your scripts I don't know how I would grow my
> > > cluster from 3 instances to say, 30 :)
> > >
> > > Thank you so much everyone for your help and for sticking with me.
> > >
> > >
> > > On Wed, Dec 9, 2009 at 8:25 PM, Something Something <[email protected]> wrote:
> > >
> > > > When I run:
> > > >
> > > > hbase-ec2 launch-cluster testcluster 3 3
> > > >
> > > > I keep getting 'lost connection' messages (See below).  Tried this 4
> > > > times.  Please help.  Thanks.
> > > >
> > > >
> > > > -------------------------------------------------------------
> > > >
> > > > Creating/checking security groups
> > > > Security group testcluster-master exists, ok
> > > > Security group testcluster exists, ok
> > > > Security group testcluster-zookeeper exists, ok
> > > > Starting ZooKeeper quorum ensemble.
> > > > Starting an AMI with ID ami-b0cb29d9 (arch i386) in group
> > > > testcluster-zookeeper
> > > > Waiting for instance i-9db6f4f5 to start: .................. Started
> > > > ZooKeeper instance i-9db6f4f5 as domU-12-31-38-01-7D-D1.compute-1.internal
> > > >     Public DNS name is ec2-174-129-148-5.compute-1.amazonaws.com.
> > > > Starting an AMI with ID ami-b0cb29d9 (arch i386) in group
> > > > testcluster-zookeeper
> > > > Waiting for instance i-2db7f545 to start: ................. Started
> > > > ZooKeeper instance i-2db7f545 as domU-12-31-38-01-7D-43.compute-1.internal
> > > >     Public DNS name is ec2-174-129-157-122.compute-1.amazonaws.com.
> > > > Starting an AMI with ID ami-b0cb29d9 (arch i386) in group
> > > > testcluster-zookeeper
> > > > Waiting for instance i-afb7f5c7 to start: ...................... Started
> > > > ZooKeeper instance i-afb7f5c7 as domU-12-31-38-01-78-F3.compute-1.internal
> > > >     Public DNS name is ec2-174-129-179-14.compute-1.amazonaws.com.
> > > > ZooKeeper quorum is
> > > > domU-12-31-38-01-7D-D1.compute-1.internal,domU-12-31-38-01-7D-43.compute-1.internal,domU-12-31-38-01-78-F3.compute-1.internal.
> > > > Initializing the ZooKeeper quorum ensemble.
> > > >     ec2-174-129-148-5.compute-1.amazonaws.com
> > > > lost connection
> > > >     ec2-174-129-157-122.compute-1.amazonaws.com
> > > > lost connection
> > > >     ec2-174-129-179-14.compute-1.amazonaws.com
> > > > lost connection
> > > >
> > > >
> > > >
> > > >
> > > > On Wed, Dec 9, 2009 at 12:46 AM, Seth Ladd <[email protected]> wrote:
> > > >
> > > >> > Sounds like others have used Andrew's script successfully.  The only
> > > >> > difference seems to be that it starts a *dedicated* ZooKeeper quorum.
> > > >> > Should have listened to Mark when he suggested that 4 days ago :)
> > > >> >
> > > >> > Anyway, I will try Andrew's script tomorrow.
> > > >>
> > > >> I can vouch that the scripts in svn trunk work.  Thanks to Andrew for
> > > >> his help!  I was able to start a 3 node Zookeeper and 5 node HBase
> > > >> cluster on EC2 from just the scripts.
> > > >>
> > > >> Seth
> > > >>
> > > >
> > > >
> > >
> >
>
>
>
>
>
