I have a 12-server MapReduce cluster. It is a completely separate set of servers from the two servers hosting my HBase cluster. On the 12-server MapReduce cluster we easily run 200 concurrent tasks distributed over the 12 boxes.
-g

-----Original Message-----
From: Todd Lipcon [mailto:t...@cloudera.com]
Sent: Wednesday, April 14, 2010 11:19 PM
To: hbase-user@hadoop.apache.org
Cc: Paul Mahon; Bill Brune; Shaheen Bahauddin; Rohit Nigam
Subject: Re: Region server goes away

On Wed, Apr 14, 2010 at 9:20 PM, Geoff Hendrey <ghend...@decarta.com> wrote:

> Thanks for your help. See answers below.
>
> -----Original Message-----
> From: saint....@gmail.com [mailto:saint....@gmail.com] On Behalf Of Stack
> Sent: Wednesday, April 14, 2010 8:45 PM
> To: hbase-user@hadoop.apache.org
> Cc: Paul Mahon; Bill Brune; Shaheen Bahauddin; Rohit Nigam
> Subject: Re: Region server goes away
>
> On Wed, Apr 14, 2010 at 8:27 PM, Geoff Hendrey <ghend...@decarta.com> wrote:
> > Hi,
> >
> > I have posted previously about issues I was having with HDFS when I
> > was running HBase and HDFS on the same box, both pseudo-clustered.
> > Now I have two very capable servers. I've set up HDFS with a datanode
> > on each box. I've set up the namenode on one box, and ZooKeeper and
> > the HBase master on the other box. Both boxes are region servers. I
> > am using Hadoop 0.20.2 and HBase 0.20.3.
>
> What do you have for replication? If two datanodes, have you set it to
> two rather than the default 3?
>
> Geoff: I didn't change the default, so it was 3. I will change it to 2
> moving forward. Actually, for now I am going to make it 1. For initial
> test runs I don't see why I need replication at all.
>
> > I have set dfs.datanode.socket.write.timeout to 0 in hbase-site.xml.
>
> This is probably not necessary.
>
> > I am running a mapreduce job with about 200 concurrent reducers,
> > each of which writes into HBase, with 32,000-row flush buffers.

Do you really mean 200 concurrent reducers?? That is to say, 100
reducers per box? I would recommend that only if you have a 100+ core
machine... not likely. FYI, typical values for reduce slots on a dual
quad-core Nehalem with hyperthreading (i.e., 16 logical cores) are in
the range of 8-10, not 100!

-Todd

--
Todd Lipcon
Software Engineer, Cloudera
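For reference on the replication question: with only two datanodes, the
replication factor is set via dfs.replication. A minimal sketch, assuming
stock Hadoop 0.20.x and hdfs-site.xml on the client and datanode boxes
(the value 2 matches the two-datanode setup discussed above):

    <!-- hdfs-site.xml: dfs.replication is the standard Hadoop property;
         a value of 2 matches the two-datanode cluster in this thread -->
    <property>
      <name>dfs.replication</name>
      <value>2</value>
    </property>

Note the setting only applies to files written after the change; existing
files keep their replication unless adjusted with 'hadoop fs -setrep'.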
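Todd's suggested 8-10 reduce slots would be configured per TaskTracker
node. A sketch, assuming Hadoop 0.20.x property names (the value 8 is the
low end of his range):

    <!-- mapred-site.xml on each TaskTracker node: caps the number of
         concurrent reduce tasks per box; takes effect after a
         TaskTracker restart -->
    <property>
      <name>mapred.tasktracker.reduce.tasks.maximum</name>
      <value>8</value>
    </property>

At 8 slots on each of the 12 MapReduce boxes, the cluster would top out
at 96 concurrent reducers rather than 200.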
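On the "32,000-row flush buffers": the HBase 0.20.x client write buffer
is sized in bytes, not rows, so flushing every N rows has to be done
explicitly. A minimal sketch of what a reducer's write path might look
like under that assumption; the table name, buffer size, and the puts
collection are illustrative, not taken from this thread:

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;

    // Fragment only: assumes a List<Put> named puts built by the reducer.
    HTable table = new HTable(new HBaseConfiguration(), "my_table");
    table.setAutoFlush(false);                  // buffer Puts client-side
    table.setWriteBufferSize(12 * 1024 * 1024); // sized in bytes, not rows
    int count = 0;
    for (Put put : puts) {
      table.put(put);               // client flushes when the buffer fills
      if (++count % 32000 == 0) {
        table.flushCommits();       // explicit flush every 32,000 rows
      }
    }
    table.flushCommits();           // flush the remainder
    table.close();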