When is the 0.20 release expected?

Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


On Wed, Apr 8, 2009 at 12:37 AM, Ryan Rawson <[email protected]> wrote:

> Just FYI, 0.20 handles small cell values substantially better than 0.19.1.
>
> -ryan
>
> On Wed, Apr 8, 2009 at 12:35 AM, Amandeep Khurana <[email protected]> wrote:
>
> > Hadoop and HBase are intelligent enough to balance the load; it's not
> > often that you need to balance it manually. Your cluster isn't
> > performing well because of the low memory and the low limits on top of
> > it. I don't think the load distribution is a problem at all.
> >
> > Hadoop and HBase are not designed for small data sizes and therefore
> > don't perform at their best with small files or small tables. The most
> > difficult phase with HBase is starting up and growing a table to a
> > certain threshold size. You'll encounter trouble in that phase (which
> > you already are). After that, it's a breeze...
> >
> >
> > Amandeep Khurana
> > Computer Science Graduate Student
> > University of California, Santa Cruz
> >
> >
> > On Wed, Apr 8, 2009 at 12:29 AM, Rakhi Khatwani <[email protected]> wrote:
> >
> > > Thanks, Amandeep.
> > >
> > > One more question: I mailed about this earlier and attached the
> > > snapshot along with that email. I have noticed that all my requests
> > > are handled by one region server... Is there any way to balance the
> > > load? And will balancing the load improve the performance?
> > >
> > > PS: I have tried using Hadoop load balancing, but after some time
> > > some of my region servers shut down... I have even gone through the
> > > archives, and someone did report an unstable cluster due to load
> > > balancing, so I really don't know if I should turn load balancing on.
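> > >
> > > (By "load balancing" I mean the HDFS balancer; what I ran was roughly
> > >
> > >   bin/hadoop balancer -threshold 10
> > >
> > > with the threshold value from memory, so take that as approximate.)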
> > >
> > > Thanks,
> > > Raakhi
> > >
> > > On Wed, Apr 8, 2009 at 12:51 PM, Amandeep Khurana <[email protected]> wrote:
> > >
> > > > I'm not sure I can answer that correctly, but my guess is no, it
> > > > won't hamper the performance.
> > > >
> > > >
> > > > Amandeep Khurana
> > > > Computer Science Graduate Student
> > > > University of California, Santa Cruz
> > > >
> > > >
> > > > On Wed, Apr 8, 2009 at 12:13 AM, Rakhi Khatwani <[email protected]> wrote:
> > > >
> > > > > Hi Amandeep,
> > > > >
> > > > > But in that case, if I let HBase split it automatically, my table
> > > > > with 17000 rows will have only one region, and thus my analysis
> > > > > will have only one map. Won't the analysis process be slower in
> > > > > that case?
> > > > >
> > > > > Thanks,
> > > > > Raakhi
> > > > >
> > > > > On Wed, Apr 8, 2009 at 12:35 PM, Amandeep Khurana <[email protected]> wrote:
> > > > >
> > > > > > You can't compensate for RAM with processing power. HBase keeps
> > > > > > a lot of open file handles into HDFS, which needs memory, so you
> > > > > > need the RAM.
> > > > > >
> > > > > > Secondly, 17000 rows isn't much to cause a region split. I don't
> > > > > > know the exact numbers, but I had a table with 6 million rows and
> > > > > > only 3 regions. So that's not a big deal.
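> > > > > >
> > > > > > (If you do want splits to come sooner, the knob I know of is
> > > > > > hbase.hregion.max.filesize in hbase-site.xml; as I understand it,
> > > > > > a region splits once a store file grows past that size, 256MB by
> > > > > > default. An illustrative sketch, not a recommendation:
> > > > > >
> > > > > >   <property>
> > > > > >     <name>hbase.hregion.max.filesize</name>
> > > > > >     <value>67108864</value> <!-- 64MB, example value only -->
> > > > > >   </property>
> > > > > >
> > > > > > That said, I'd leave it at the default.)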
> > > > > >
> > > > > > Thirdly, try upping the xceivers and the ulimit and see if it
> > > > > > works with the existing RAM... That's the only way out.
> > > > > >
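> > > > > > Concretely, something like this (note the property name really
> > > > > > is spelled "xcievers" in Hadoop, and I'm assuming your daemons
> > > > > > run as user "hadoop"):
> > > > > >
> > > > > >   <!-- hadoop-site.xml -->
> > > > > >   <property>
> > > > > >     <name>dfs.datanode.max.xcievers</name>
> > > > > >     <value>2048</value>
> > > > > >   </property>
> > > > > >
> > > > > >   # /etc/security/limits.conf, for the 32k file limit
> > > > > >   hadoop  -  nofile  32768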
> > > > > >
> > > > > > Amandeep Khurana
> > > > > > Computer Science Graduate Student
> > > > > > University of California, Santa Cruz
> > > > > >
> > > > > >
> > > > > > On Wed, Apr 8, 2009 at 12:02 AM, Rakhi Khatwani <[email protected]> wrote:
> > > > > >
> > > > > > > Hi Amandeep,
> > > > > > >
> > > > > > > Following is my EC2 cluster configuration:
> > > > > > > High-CPU Medium Instance: 1.7 GB of memory, 5 EC2 Compute Units
> > > > > > > (2 virtual cores with 2.5 EC2 Compute Units each), 350 GB of
> > > > > > > instance storage, 32-bit platform.
> > > > > > >
> > > > > > > So I don't think I have much option when it comes to the memory.
> > > > > > > However, is there any way I can make use of the 5 EC2 compute
> > > > > > > units to increase my performance?
> > > > > > >
> > > > > > > Regarding the table splits, I don't see HBase doing the splits
> > > > > > > automatically. After loading about 17000 rows into table1, I
> > > > > > > can still see it as one region (after checking the web UI);
> > > > > > > that's why I had to split it manually. Or is there any
> > > > > > > configuration/setting I have to do to ensure that the tables
> > > > > > > are split automatically?
> > > > > > >
> > > > > > > I will increase the dataXceivers and raise the ulimit to 32k.
> > > > > > >
> > > > > > > Thanks a ton
> > > > > > > Rakhi.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > > Hi Amandeep,
> > > > > > > > > I have 1GB memory on each node of the EC2 cluster (C1
> > > > > > > > > Medium). I am using hadoop-0.19.0 and hbase-0.19.0. We are
> > > > > > > > > starting with 10,000 rows, but later it will go up to
> > > > > > > > > 100,000 rows.
> > > > > > > >
> > > > > > > >
> > > > > > > > 1GB is too low. You need around 4GB to get a stable system.
> > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > My map task basically reads an HBase table, 'Table1',
> > > > > > > > > performs analysis on each row, and dumps the analysis
> > > > > > > > > results into another HBase table, 'Table2'. Each analysis
> > > > > > > > > task takes about 3-4 minutes when tested on a local machine
> > > > > > > > > (the algorithm part alone, without the map reduce).
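> > > > > > > > >
> > > > > > > > > In outline, the job setup looks roughly like the sketch
> > > > > > > > > below. Class, column, and method names are illustrative,
> > > > > > > > > analyze() stands in for my algorithm, and this is from
> > > > > > > > > memory of the 0.19 mapred API, so treat it as approximate:
> > > > > > > > >
> > > > > > > > >   import java.io.IOException;
> > > > > > > > >   import org.apache.hadoop.hbase.HBaseConfiguration;
> > > > > > > > >   import org.apache.hadoop.hbase.io.BatchUpdate;
> > > > > > > > >   import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
> > > > > > > > >   import org.apache.hadoop.hbase.io.RowResult;
> > > > > > > > >   import org.apache.hadoop.hbase.mapred.IdentityTableReduce;
> > > > > > > > >   import org.apache.hadoop.hbase.mapred.TableMap;
> > > > > > > > >   import org.apache.hadoop.hbase.mapred.TableMapReduceUtil;
> > > > > > > > >   import org.apache.hadoop.mapred.*;
> > > > > > > > >
> > > > > > > > >   public class AnalysisJob {
> > > > > > > > >     // Map: run the per-row analysis and emit a BatchUpdate
> > > > > > > > >     // destined for Table2.
> > > > > > > > >     public static class AnalysisMap extends MapReduceBase
> > > > > > > > >         implements TableMap<ImmutableBytesWritable, BatchUpdate> {
> > > > > > > > >       public void map(ImmutableBytesWritable row, RowResult value,
> > > > > > > > >           OutputCollector<ImmutableBytesWritable, BatchUpdate> out,
> > > > > > > > >           Reporter reporter) throws IOException {
> > > > > > > > >         BatchUpdate bu = new BatchUpdate(row.get());
> > > > > > > > >         bu.put("results:score", analyze(value));
> > > > > > > > >         out.collect(row, bu);
> > > > > > > > >       }
> > > > > > > > >       private byte[] analyze(RowResult value) {
> > > > > > > > >         return new byte[0]; // stub for the real analysis
> > > > > > > > >       }
> > > > > > > > >     }
> > > > > > > > >
> > > > > > > > >     public static void main(String[] args) throws Exception {
> > > > > > > > >       JobConf job = new JobConf(new HBaseConfiguration(),
> > > > > > > > >           AnalysisJob.class);
> > > > > > > > >       // One map task is created per region of Table1.
> > > > > > > > >       TableMapReduceUtil.initTableMapJob("Table1", "data:",
> > > > > > > > >           AnalysisMap.class, ImmutableBytesWritable.class,
> > > > > > > > >           BatchUpdate.class, job);
> > > > > > > > >       // Write the emitted BatchUpdates into Table2.
> > > > > > > > >       TableMapReduceUtil.initTableReduceJob("Table2",
> > > > > > > > >           IdentityTableReduce.class, job);
> > > > > > > > >       JobClient.runJob(job);
> > > > > > > > >     }
> > > > > > > > >   }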
> > > > > > > > >
> > > > > > > > > I have divided 'Table1' into 30 regions before sending it
> > > > > > > > > to the map, and set the maximum number of map tasks to 20.
> > > > > > > >
> > > > > > > > Let HBase do the division into regions. Leave the table as
> > > > > > > > it is in the default state.
> > > > > > > >
> > > > > > > > >
> > > > > > > > > I have set dataXceivers to 1024 and the ulimit to 1024.
> > > > > > > >
> > > > > > > > Yes, increase these: 2048 dataxceivers and a 32k ulimit.
> > > > > > > >
> > > > > > > > >
> > > > > > > > > I am able to process about 300 rows in an hour, which I
> > > > > > > > > feel is quite slow... How do I increase the performance?
> > > > > > > >
> > > > > > > > The reasons are mentioned above.
> > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Meanwhile I will try setting the dataXceivers to 2048 and
> > > > > > > > > increasing the file limit as you mentioned.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Rakhi
> > > > > > > > >
> > > > > > > > > On Wed, Apr 8, 2009 at 11:40 AM, Amandeep Khurana <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > 20 nodes is good enough to begin with. How much memory do
> > > > > > > > > > you have on each node? IMO, you should keep 1GB per daemon
> > > > > > > > > > and 1GB for the MR job, like Andrew suggested.
> > > > > > > > > > You don't necessarily have to separate the datanodes and
> > > > > > > > > > tasktrackers as long as you have enough resources.
> > > > > > > > > > 10000 rows isn't big at all from an HBase standpoint. What
> > > > > > > > > > kind of computation are you doing before dumping data into
> > > > > > > > > > HBase? And what versions of Hadoop and HBase are you
> > > > > > > > > > running?
> > > > > > > > > >
> > > > > > > > > > There's another thing you should do: increase the
> > > > > > > > > > DataXceivers limit to 2048 (that's what I use).
> > > > > > > > > >
> > > > > > > > > > If you have root privilege on the cluster, then increase
> > > > > > > > > > the file limit to 32k (see the HBase FAQ for details).
> > > > > > > > > >
> > > > > > > > > > Try this out and see how it goes.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Amandeep Khurana
> > > > > > > > > > Computer Science Graduate Student
> > > > > > > > > > University of California, Santa Cruz
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Tue, Apr 7, 2009 at 2:45 AM, Rakhi Khatwani <[email protected]> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi,
> > > > > > > > > > >      I have a 20 node cluster on EC2 (small instance).
> > > > > > > > > > > I have a set of tables which store a huge amount of
> > > > > > > > > > > data (tried with 10,000 rows; more to be added). But
> > > > > > > > > > > during my map reduce jobs, some of the region servers
> > > > > > > > > > > shut down, causing data loss and halting my program's
> > > > > > > > > > > execution; in fact, one of my tables got damaged.
> > > > > > > > > > > Whenever I scan that table, I get the "could not obtain
> > > > > > > > > > > block" error.
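> > > > > > > > > > >
> > > > > > > > > > > (Would running something like
> > > > > > > > > > >
> > > > > > > > > > >   bin/hadoop fsck / -files -blocks
> > > > > > > > > > >
> > > > > > > > > > > help pinpoint the corrupt files? That's the only tool I
> > > > > > > > > > > know of for this, so I may be off.)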
> > > > > > > > > > >
> > > > > > > > > > > 1. I want to make the cluster more robust, since it
> > > > > > > > > > > contains a lot of data and it's really important that
> > > > > > > > > > > it remains stable.
> > > > > > > > > > > 2. If one of my tables gets damaged (even after
> > > > > > > > > > > restarting DFS and HBase), how do I go about recovering
> > > > > > > > > > > it?
> > > > > > > > > > >
> > > > > > > > > > > My EC2 cluster mostly has the default configuration,
> > > > > > > > > > > with hadoop-site and hbase-site having some entries
> > > > > > > > > > > pertaining to map-reduce (for example, the number of
> > > > > > > > > > > map tasks, mapred.task.timeout, etc.).
> > > > > > > > > > >
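> > > > > > > > > > > Roughly along these lines in hadoop-site.xml (the
> > > > > > > > > > > values here are illustrative, not exactly what I have):
> > > > > > > > > > >
> > > > > > > > > > >   <property>
> > > > > > > > > > >     <name>mapred.map.tasks</name>
> > > > > > > > > > >     <value>20</value>
> > > > > > > > > > >   </property>
> > > > > > > > > > >   <property>
> > > > > > > > > > >     <name>mapred.task.timeout</name>
> > > > > > > > > > >     <value>1200000</value> <!-- 20 min; example only -->
> > > > > > > > > > >   </property>
> > > > > > > > > > >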
> > > > > > > > > > > Your help will be greatly appreciated.
> > > > > > > > > > > Thanks,
> > > > > > > > > > > Raakhi Khatwani
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
