Hi,

I am running a 3-node cluster. An HDFS datanode and an HBase regionserver
run on each node. The HBase master and the HDFS namenode run on
different machines (not "different" in the sense of "outside the
cluster", just "not on the same box within the cluster"). Each machine
is quad core with a 64-bit JVM, 32 GB RAM, and 4 disks. We had a lot of
trouble keeping the cluster alive when it was paired with an
asymmetric (big) MapReduce cluster that was writing into HBase.
Ultimately, we achieved stability by disabling the WAL from code in our
MapReduce jobs and by setting the HFile block size lower than the default
(we do a lot of random reads in the map phase). Other tweaks are needed
as well, such as raising the OS open-file limit. I posted a lot about
this in May, so the list archive has the details. At present, we're
quite happy with the cluster.
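
For the OS file limit, a quick way to see what a process actually gets is
to ask the OS directly. The snippet below is only a rough sketch using
Python's standard `resource` module (Unix only). The 32768 threshold and
the function name are assumptions for illustration; they are not the
values or tooling used on this cluster.

```python
import resource

# Hypothetical threshold for illustration only; pick whatever your
# datanode/regionserver workload actually needs.
ASSUMED_MIN_OPEN_FILES = 32768


def open_file_limit_ok(min_required=ASSUMED_MIN_OPEN_FILES):
    """Return (ok, soft, hard) for the current process's open-file limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # RLIM_INFINITY means "no cap", which satisfies any requirement.
    ok = soft == resource.RLIM_INFINITY or soft >= min_required
    return ok, soft, hard


if __name__ == "__main__":
    ok, soft, hard = open_file_limit_ok()
    print(f"soft={soft} hard={hard} ok={ok}")
```

Run it as the same user that starts the datanode and regionserver, since
the limit is per user/process and a shell login may see a different value.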

-geoff

-----Original Message-----
From: Paul Smith [mailto:[email protected]] 
Sent: Thursday, July 22, 2010 3:56 PM
To: [email protected]
Subject: Smallest production HBase cluster

Is anyone able to share their experience or thoughts on the 'smallest'
production HBase cluster in operation? I'm thinking there may be some
point on the node-count scale where one transitions from "that's silly"
to "that's actually more like it".

Anyone out there with a small HBase cluster in operation with < 10 nodes
able to share any information?

I notice on http://wiki.apache.org/hadoop/Hbase/PoweredBy that some
users have just a 3-node cluster. Perhaps that's out of date, but I'm
curious to know where the community thinks 'the line' needs to be drawn
on usage of HBase.

To take things to an extreme, is there anyone actually running a
_single_ HBase node... ? (one would hope that machine is actually
designed to be a bit more HA than normal) just to take advantage of a
column-oriented store?

thanks,

Paul
