Just FYI, 0.20 handles small cell values substantially better than 0.19.1. -ryan
On Wed, Apr 8, 2009 at 12:35 AM, Amandeep Khurana <[email protected]> wrote:

> Hadoop and HBase are intelligent enough to balance the load. It's not very
> frequent that you need to balance the load manually. Your cluster isn't
> performing because of the low memory and the low limits on top of it. I
> don't think the load is a problem at all.
>
> Hadoop and HBase are not designed for small data sizes and therefore don't
> have the best performance when you have small files or small tables. The
> most difficult part of HBase is starting up and growing the table to a
> certain threshold level. You'll encounter trouble in that phase (which you
> already are). After that, it's a breeze...
>
>
> Amandeep Khurana
> Computer Science Graduate Student
> University of California, Santa Cruz
>
>
> On Wed, Apr 8, 2009 at 12:29 AM, Rakhi Khatwani <[email protected]> wrote:
>
> > Thanks, Amandeep
> >
> > One more question: I mailed it earlier and attached the snapshot along
> > with that email. I have noticed that all my requests are handled by one
> > region server... Is there any way to balance the load? And will
> > balancing the load improve the performance?
> >
> > PS: I have tried using Hadoop load balancing, but after some time some
> > of my region servers shut down... I have even gone through the archives
> > and someone did report an unstable cluster due to load balancing, so I
> > really don't know if I should turn load balancing on.
> >
> > Thanks,
> > Raakhi
> >
> > On Wed, Apr 8, 2009 at 12:51 PM, Amandeep Khurana <[email protected]> wrote:
> >
> > > I'm not sure if I can answer that correctly or not, but my guess is
> > > no, it won't hamper the performance.
> > >
> > >
> > > Amandeep Khurana
> > > Computer Science Graduate Student
> > > University of California, Santa Cruz
> > >
> > >
> > > On Wed, Apr 8, 2009 at 12:13 AM, Rakhi Khatwani <[email protected]> wrote:
> > >
> > > > Hi Amandeep,
> > > >
> > > > But in that case, if I let HBase split it automatically, my table
> > > > with 17000 rows will have only one region. Thus my analysis will
> > > > have only one map. Won't the analysis process be slower in that
> > > > case?
> > > >
> > > > Thanks,
> > > > Raakhi
> > > >
> > > > On Wed, Apr 8, 2009 at 12:35 PM, Amandeep Khurana <[email protected]> wrote:
> > > >
> > > > > You can't compensate for the RAM with processing power. HBase
> > > > > keeps a lot of open file handles in HDFS, which needs memory, so
> > > > > you need the RAM.
> > > > >
> > > > > Secondly, 17000 rows isn't much to cause a region split. I don't
> > > > > know exact numbers, but I had a table with 6 million rows and
> > > > > only 3 regions. So that's not a big deal.
> > > > >
> > > > > Thirdly, try upping the xceivers and ulimit and see if it works
> > > > > with the existing RAM... That's the only way out.
> > > > >
> > > > >
> > > > > Amandeep Khurana
> > > > > Computer Science Graduate Student
> > > > > University of California, Santa Cruz
> > > > >
> > > > >
> > > > > On Wed, Apr 8, 2009 at 12:02 AM, Rakhi Khatwani <[email protected]> wrote:
> > > > >
> > > > > > Hi Amandeep,
> > > > > >
> > > > > > Following is my EC2 cluster configuration:
> > > > > > High-CPU Medium Instance: 1.7 GB of memory, 5 EC2 Compute Units
> > > > > > (2 virtual cores with 2.5 EC2 Compute Units each), 350 GB of
> > > > > > instance storage, 32-bit platform
> > > > > >
> > > > > > So I don't think I have much option when it comes to the GB
> > > > > > part. However, is there any way I can make use of the 5 EC2
> > > > > > compute units to increase my performance?
> > > > > >
> > > > > > Regarding the table splits, I don't see HBase doing the table
> > > > > > splits automatically. After loading about 17000 rows in table1,
> > > > > > I can still see it as one region (after checking it on the web
> > > > > > UI). That's why I had to manually split it. Or is there any
> > > > > > configuration/setting I have to do to ensure that the tables
> > > > > > are split automatically?
> > > > > >
> > > > > > I will increase the dataXceivers and ulimit to 32k.
> > > > > >
> > > > > > Thanks a ton
> > > > > > Rakhi.
> > > > > >
> > > > > > > > Hi Amandeep,
> > > > > > > > I have 1GB memory on each node on the EC2 cluster (C1
> > > > > > > > Medium). I am using hadoop-0.19.0 and hbase-0.19.0.
> > > > > > > > Well, we were starting with 10,000 rows, but later it will
> > > > > > > > go up to 100,000 rows.
> > > > > > >
> > > > > > > 1GB is too low. You need around 4GB to get a stable system.
> > > > > > >
> > > > > > > > My map task basically reads an HBase table 'Table1',
> > > > > > > > performs analysis on each row, and dumps the analysis
> > > > > > > > results into another HBase table 'Table2'. Each analysis
> > > > > > > > task takes about 3-4 minutes when tested on a local machine
> > > > > > > > (the algorithm part... without the map reduce).
> > > > > > > >
> > > > > > > > I have divided 'Table1' into 30 regions before sending it
> > > > > > > > to the map, and set the maximum number of map tasks to 20.
> > > > > > >
> > > > > > > Let HBase do the division into regions. Leave the table as it
> > > > > > > is in the default state.
> > > > > > >
> > > > > > > > I have set DataXceivers to 1024 and ulimit to 1024.
> > > > > > >
> > > > > > > Yes... increase these: 2048 dataxceivers and 32k ulimit.
> > > > > > >
> > > > > > > > I am able to process about 300 rows in an hour, which I
> > > > > > > > feel is quite slow... How do I increase the performance?
> > > > > > >
> > > > > > > The reasons are mentioned above.
> > > > > > >
> > > > > > > > Meanwhile I will try setting the dataXceivers to 2048 and
> > > > > > > > increasing the file limit as you mentioned.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Rakhi
> > > > > > > >
> > > > > > > > On Wed, Apr 8, 2009 at 11:40 AM, Amandeep Khurana <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > 20 nodes is good enough to begin with. How much memory do
> > > > > > > > > you have on each node? IMO, you should keep 1GB per
> > > > > > > > > daemon and 1GB for the MR job like Andrew suggested.
> > > > > > > > > You don't necessarily have to separate the datanodes and
> > > > > > > > > tasktrackers as long as you have enough resources.
> > > > > > > > > 10000 rows isn't big at all from an HBase standpoint.
> > > > > > > > > What kind of computation are you doing before dumping
> > > > > > > > > data into HBase? And what versions of Hadoop and HBase
> > > > > > > > > are you running?
> > > > > > > > >
> > > > > > > > > There's another thing you should do. Increase the
> > > > > > > > > DataXceivers limit to 2048 (that's what I use).
> > > > > > > > >
> > > > > > > > > If you have root privilege over the cluster, then
> > > > > > > > > increase the file limit to 32k (see the HBase FAQ for
> > > > > > > > > details).
> > > > > > > > >
> > > > > > > > > Try this out and see how it goes.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Amandeep Khurana
> > > > > > > > > Computer Science Graduate Student
> > > > > > > > > University of California, Santa Cruz
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Tue, Apr 7, 2009 at 2:45 AM, Rakhi Khatwani <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > Hi,
> > > > > > > > > > I have a 20 node cluster on EC2 (small instance)... I
> > > > > > > > > > have a set of tables which store a huge amount of data
> > > > > > > > > > (tried with 10,000 rows... more to be added)... but
> > > > > > > > > > during my map reduce jobs, some of the region servers
> > > > > > > > > > shut down, thereby causing data loss, a stop in my
> > > > > > > > > > program execution, and in fact one of my tables got
> > > > > > > > > > damaged. Whenever I scan the table, I get the "could
> > > > > > > > > > not obtain block" error.
> > > > > > > > > >
> > > > > > > > > > 1. I want to make the cluster more robust, since it
> > > > > > > > > > contains a lot of data and it's really important that
> > > > > > > > > > it remains stable.
> > > > > > > > > > 2. If one of my tables gets damaged (even after
> > > > > > > > > > restarting DFS and HBase), how do I go about recovering
> > > > > > > > > > it?
> > > > > > > > > >
> > > > > > > > > > My EC2 cluster mostly has the default configuration,
> > > > > > > > > > with hadoop-site and hbase-site having some entries
> > > > > > > > > > pertaining to map-reduce (for example, number of map
> > > > > > > > > > tasks, mapred.task.timeout, etc.).
> > > > > > > > > >
> > > > > > > > > > Your help will be greatly appreciated.
> > > > > > > > > > Thanks,
> > > > > > > > > > Raakhi Khatwani
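
The DataXceiver and file-handle advice repeated in the thread usually comes
down to two changes. A minimal sketch follows, assuming a hadoop-0.19-era
hadoop-site.xml and daemons running under a "hadoop" account (the account
name and file locations are assumptions; the HBase FAQ has the exact steps):

  <!-- hadoop-site.xml on each datanode: raise the xceiver count.
       Note the property name really is spelled "xcievers". -->
  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>2048</value>
  </property>

  # /etc/security/limits.conf: raise the open-file limit to 32k for the
  # (assumed) "hadoop" user that runs the datanodes and regionservers
  hadoop  -  nofile  32768

The datanodes need a restart for the xceiver change, and the limits.conf
change only applies to new logins, so the daemons have to be restarted from
a fresh shell for the higher ulimit to take effect.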

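For the Table1-to-Table2 analysis job Rakhi describes, a rough sketch
against the hbase-0.19 org.apache.hadoop.hbase.mapred API is below. The
class name, the "data:" and "result:" column families, and the doAnalysis()
helper are placeholders rather than anything from the thread, and the exact
signatures should be checked against the release in use:

  // Sketch only: scan Table1, analyse each row, write results to Table2.
  import java.io.IOException;

  import org.apache.hadoop.hbase.io.BatchUpdate;
  import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
  import org.apache.hadoop.hbase.io.RowResult;
  import org.apache.hadoop.hbase.mapred.IdentityTableReduce;
  import org.apache.hadoop.hbase.mapred.TableMap;
  import org.apache.hadoop.hbase.mapred.TableMapReduceUtil;
  import org.apache.hadoop.mapred.JobClient;
  import org.apache.hadoop.mapred.JobConf;
  import org.apache.hadoop.mapred.MapReduceBase;
  import org.apache.hadoop.mapred.OutputCollector;
  import org.apache.hadoop.mapred.Reporter;

  public class AnalysisJob {

    public static class AnalysisMap extends MapReduceBase
        implements TableMap<ImmutableBytesWritable, BatchUpdate> {

      public void map(ImmutableBytesWritable row, RowResult value,
          OutputCollector<ImmutableBytesWritable, BatchUpdate> output,
          Reporter reporter) throws IOException {
        // Hypothetical per-row analysis; the real 3-4 minute algorithm
        // from the thread would go here.
        byte[] result = doAnalysis(value);
        BatchUpdate update = new BatchUpdate(row.get());
        update.put("result:value", result); // "result:" family is assumed
        output.collect(row, update);
      }

      private byte[] doAnalysis(RowResult value) {
        return value.getRow(); // placeholder for the actual analysis
      }
    }

    public static void main(String[] args) throws IOException {
      JobConf job = new JobConf(AnalysisJob.class);
      job.setJobName("table1-analysis");
      // Map over the "data:" family of Table1; write updates into Table2.
      TableMapReduceUtil.initTableMapJob("Table1", "data:", AnalysisMap.class,
          ImmutableBytesWritable.class, BatchUpdate.class, job);
      TableMapReduceUtil.initTableReduceJob("Table2",
          IdentityTableReduce.class, job);
      JobClient.runJob(job);
    }
  }

Because the table input format produces one split per region, a job like
this runs only one map task while Table1 is a single region, which is why
the thread keeps coming back to memory, xceivers, and region splits rather
than the number of map slots.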