On Wed, Apr 14, 2010 at 9:20 PM, Geoff Hendrey <ghend...@decarta.com> wrote:
> Thanks for your help. See answers below.
>
> -----Original Message-----
> From: saint....@gmail.com [mailto:saint....@gmail.com] On Behalf Of Stack
> Sent: Wednesday, April 14, 2010 8:45 PM
> To: hbase-user@hadoop.apache.org
> Cc: Paul Mahon; Bill Brune; Shaheen Bahauddin; Rohit Nigam
> Subject: Re: Region server goes away
>
> On Wed, Apr 14, 2010 at 8:27 PM, Geoff Hendrey <ghend...@decarta.com> wrote:
> > Hi,
> >
> > I have posted previously about issues I was having with HDFS when I
> > was running HBase and HDFS on the same box both pseudoclustered. Now I
> > have two very capable servers. I've setup HDFS with a datanode on each box.
> > I've setup the namenode on one box, and the zookeeper and HDFS master
> > on the other box. Both boxes are region servers. I am using hadoop
> > 20.2 and hbase 20.3.
>
> What do you have for replication? If two datanodes, you've set it to two
> rather than default 3?
>
> Geoff: I didn't change the default, so it was 3. I will change it to 2
> moving forward. Actually, for now I am going to make it 1. For initial test
> runs I don't see why I need replication at all.
>
> >
> > I have set dfs.datanode.socket.write.timeout to 0 in hbase-site.xml.
> >
> This is probably not necessary.
>
> >
> > I am running a mapreduce job with about 200 concurrent reducers, each
> > of which writes into HBase, with 32,000 row flush buffers.
>

Do you really mean 200 concurrent reducers?? That is to say 100 reducers per
box? I would recommend that only if you have a 100+ core machine... not likely.

FYI typical values for reduce slots on dual quad core Nehalem with
hyperthreading (ie 16 logical cores) are in the range of 8-10, not 100!

-Todd

--
Todd Lipcon
Software Engineer, Cloudera
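
For reference, the two settings discussed in this thread could be expressed roughly as below. This is only a sketch, assuming the stock Hadoop 0.20 property names (dfs.replication in hdfs-site.xml, mapred.tasktracker.reduce.tasks.maximum in mapred-site.xml). The values are illustrative picks from the ranges mentioned above, not tested recommendations for this cluster.

```xml
<!-- hdfs-site.xml: with only two datanodes, a replication factor above 2
     cannot be satisfied (assumed value, following the suggestion above) -->
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>

<!-- mapred-site.xml: reduce slots per tasktracker (per box), not per job;
     8 is an assumed value from the 8-10 range quoted for 16 logical cores -->
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>8</value>
</property>
```

Note that the slot limit applies per tasktracker, so a job may still request 200 reducers in total. With a limit like this they would run in waves of at most 8 per box rather than all at once.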