Hadoop and HBase are intelligent enough to balance the load on their own; it's rare that you need to balance it manually. Your cluster isn't performing well because of the low memory and the low limits on top of it. I don't think the load is a problem at all.
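To be concrete, these are the knobs I mean. A minimal sketch for a 0.19-era setup; the file and property names are as I remember them, so double-check them against your install:

    # hbase-env.sh: heap for the hbase daemons, in MB (the default is
    # 1000; around 4000 is what I'd aim for, hardware permitting)
    export HBASE_HEAPSIZE=4000

    <!-- hadoop-site.xml: raise the datanode xceiver limit
         (yes, the property name really is spelled "xcievers") -->
    <property>
      <name>dfs.datanode.max.xcievers</name>
      <value>2048</value>
    </property>

    # /etc/security/limits.conf: raise the open-file limit for the user
    # that runs the daemons (needs root; see the hbase FAQ)
    hadoop  -  nofile  32768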
Hadoop and HBase are not designed for small data sizes and therefore don't give their best performance when you have small files or small tables. The most difficult part of HBase is starting up and growing a table to a certain threshold; you'll run into trouble in that phase (which you already are). After that, it's a breeze...

Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


On Wed, Apr 8, 2009 at 12:29 AM, Rakhi Khatwani <[email protected]> wrote:

> Thanks, Amandeep
>
> One more question; I mailed it earlier and attached a snapshot along with
> that email. I have noticed that all my requests are handled by one region
> server... Is there any way to balance the load? And will balancing the
> load improve the performance?
>
> PS: I have tried using hadoop load balancing, but after some time some of
> my region servers shut down... I have even gone through the archives, and
> someone did report an unstable cluster due to load balancing, so I really
> don't know if I should turn load balancing on.
>
> Thanks,
> Raakhi
>
> On Wed, Apr 8, 2009 at 12:51 PM, Amandeep Khurana <[email protected]> wrote:
>
> > I'm not sure if I can answer that correctly or not, but my guess is no,
> > it won't hamper the performance.
> >
> > Amandeep Khurana
> > Computer Science Graduate Student
> > University of California, Santa Cruz
> >
> > On Wed, Apr 8, 2009 at 12:13 AM, Rakhi Khatwani <[email protected]> wrote:
> >
> > > Hi Amandeep,
> > >
> > > But in that case, if I let HBase split it automatically, my table
> > > with 17000 rows will have only one region, and thus my analysis will
> > > have only one map. Won't the analysis process be slower in that case?
> > >
> > > Thanks,
> > > Raakhi
> > >
> > > On Wed, Apr 8, 2009 at 12:35 PM, Amandeep Khurana <[email protected]> wrote:
> > >
> > > > You can't compensate for the RAM with processing power. HBase
> > > > keeps a lot of open file handles in HDFS, which needs memory, so
> > > > you need the RAM.
> > > >
> > > > Secondly, 17000 rows isn't much to cause a region split. I don't
> > > > know exact numbers, but I had a table with 6 million rows and only
> > > > 3 regions. So that's not a big deal.
> > > >
> > > > Thirdly, try upping the xceivers and ulimit and see if it works
> > > > with the existing RAM... That's the only way out.
> > > >
> > > > Amandeep Khurana
> > > > Computer Science Graduate Student
> > > > University of California, Santa Cruz
> > > >
> > > > On Wed, Apr 8, 2009 at 12:02 AM, Rakhi Khatwani <[email protected]> wrote:
> > > >
> > > > > Hi Amandeep,
> > > > >
> > > > > Following is my ec2 cluster configuration:
> > > > > High-CPU Medium Instance: 1.7 GB of memory, 5 EC2 Compute Units
> > > > > (2 virtual cores with 2.5 EC2 Compute Units each), 350 GB of
> > > > > instance storage, 32-bit platform
> > > > >
> > > > > So I don't think I have much option when it comes to the GB
> > > > > part. However, is there any way I can make use of the 5 EC2
> > > > > compute units to increase my performance?
> > > > >
> > > > > Regarding the table splits, I don't see HBase doing the table
> > > > > splits automatically. After loading about 17000 rows in table1,
> > > > > I can still see it as one region (after checking it on the web
> > > > > UI). That's why I had to manually split it.
> > > > > Or is there any configuration/settings I have to do to ensure
> > > > > that the tables are split automatically?
> > > > >
> > > > > I will increase the dataXceivers and ulimit to 32k.
> > > > >
> > > > > Thanks a ton,
> > > > > Rakhi
> > > > >
> > > > > > > Hi Amandeep,
> > > > > > > I have 1GB memory on each node on the ec2 cluster (C1
> > > > > > > Medium). I am using hadoop-0.19.0 and hbase-0.19.0.
> > > > > > > Well, we were starting with 10,000 rows, but later it will
> > > > > > > go up to 100,000 rows.
> > > > > >
> > > > > > 1GB is too low. You need around 4GB to get a stable system.
> > > > > >
> > > > > > > My map task basically reads an hbase table 'Table1',
> > > > > > > performs analysis on each row, and dumps the analysis
> > > > > > > results into another hbase table 'Table2'. Each analysis
> > > > > > > task takes about 3-4 minutes when tested on a local machine
> > > > > > > (the algorithm part... w/o the map reduce).
> > > > > > >
> > > > > > > I have divided 'Table1' into 30 regions before sending it
> > > > > > > to the map, and set the maximum number of map tasks to 20.
> > > > > >
> > > > > > Let hbase do the division into regions. Leave the table as it
> > > > > > is in its default state.
> > > > > >
> > > > > > > I have set dataXceivers to 1024 and ulimit to 1024.
> > > > > >
> > > > > > Yes... increase these: 2048 dataxceivers and a 32k ulimit.
> > > > > >
> > > > > > > I am able to process about 300 rows in an hour, which I
> > > > > > > feel is quite slow... How do I increase the performance?
> > > > > >
> > > > > > The reasons are mentioned above.
> > > > > >
> > > > > > > Meanwhile I will try setting the dataXceivers to 2048 and
> > > > > > > increasing the file limit as you mentioned.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Rakhi
> > > > > > >
> > > > > > > On Wed, Apr 8, 2009 at 11:40 AM, Amandeep Khurana <[email protected]> wrote:
> > > > > > >
> > > > > > > > 20 nodes is good enough to begin with. How much memory do
> > > > > > > > you have on each node? IMO, you should keep 1GB per
> > > > > > > > daemon and 1GB for the MR job, like Andrew suggested.
> > > > > > > > You don't necessarily have to separate the datanodes and
> > > > > > > > tasktrackers as long as you have enough resources.
> > > > > > > > 10000 rows isn't big at all from an hbase standpoint.
> > > > > > > > What kind of computation are you doing before dumping
> > > > > > > > data into hbase? And what versions of Hadoop and Hbase
> > > > > > > > are you running?
> > > > > > > >
> > > > > > > > There's another thing you should do: increase the
> > > > > > > > DataXceivers limit to 2048 (that's what I use).
> > > > > > > >
> > > > > > > > If you have root privilege over the cluster, then
> > > > > > > > increase the file limit to 32k (see the hbase FAQ for
> > > > > > > > details).
> > > > > > > >
> > > > > > > > Try this out and see how it goes.
> > > > > > > >
> > > > > > > > Amandeep Khurana
> > > > > > > > Computer Science Graduate Student
> > > > > > > > University of California, Santa Cruz
> > > > > > > >
> > > > > > > > On Tue, Apr 7, 2009 at 2:45 AM, Rakhi Khatwani <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > Hi,
> > > > > > > > > I have a 20 node cluster on ec2 (small instance)... I
> > > > > > > > > have a set of tables which store a huge amount of data
> > > > > > > > > (tried with 10,000 rows... more to be added)... but
> > > > > > > > > during my map reduce jobs, some of the region servers
> > > > > > > > > shut down, thereby causing data loss, a stop in my
> > > > > > > > > program execution, and in fact one of my tables got
> > > > > > > > > damaged. Whenever I scan the table, I get the "could
> > > > > > > > > not obtain block" error.
> > > > > > > > >
> > > > > > > > > 1. I want to make the cluster more robust, since it
> > > > > > > > > contains a lot of data and it's really important that
> > > > > > > > > it remains stable.
> > > > > > > > > 2. If one of my tables gets damaged (even after
> > > > > > > > > restarting dfs and hbase), how do I go about
> > > > > > > > > recovering it?
> > > > > > > > >
> > > > > > > > > My ec2 cluster mostly has the default configuration,
> > > > > > > > > with hadoop-site and hbase-site having some entries
> > > > > > > > > pertaining to map-reduce (for example, num of map
> > > > > > > > > tasks, mapred.task.timeout, etc).
> > > > > > > > >
> > > > > > > > > Your help will be greatly appreciated.
> > > > > > > > > Thanks,
> > > > > > > > > Raakhi Khatwani
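PS: On the region-split question above: splits in HBase are triggered by region size, not row count, so 17000 small rows can easily stay in a single region. And since the TableInputFormat in these releases hands out roughly one map task per region, fewer regions means fewer concurrent maps. If you'd rather not split by hand, one option is to lower the split threshold in hbase-site.xml. A minimal sketch, with the value purely illustrative (the default is 256MB, if I remember right; double-check against your version):

    <!-- hbase-site.xml: store file size at which a region splits.
         Default is 268435456 (256MB); 64MB here is just an example. -->
    <property>
      <name>hbase.hregion.max.filesize</name>
      <value>67108864</value>
    </property>

Keep in mind that lots of small regions add memory and file-handle overhead, which is exactly what is hurting this cluster, so don't push it too far.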
