Hello,
I remember Jon mentioning the other day that he was trying a single HBase
server on top of an existing HDFS cluster to serve MapReduce (MR)
results. I wonder how that went.
A couple of friends in Tokyo are considering HBase for a similar setup.
They want to serve MR results inside their clients' companies via
HBase. Both have existing MR/HDFS environments; one has a small cluster
(< 10 nodes) and the other a large one (> 50 nodes).
They'll use incremental bulk loading into an existing table
(HBASE-1923) to add the MR results to the HBase table, and only a few
users will read and export (web CSV download) the results via HBase, so
HBase will be lightly loaded. They probably won't even need a high
availability (HA) setup for HBase.
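
For reference, I expect the loading step to look roughly like the
untested sketch below, written against the HBASE-1923 / 0.90-era API
(exact class names may differ in their version). The table name
"mr_results", the column family "d", and the assumption that the MR
results are tab-separated "rowkey<TAB>value" text files are just my
guesses about their jobs:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ResultLoader {

  // Turns each "rowkey<TAB>value" line of the MR output into a KeyValue.
  static class ResultMapper
      extends Mapper<LongWritable, Text, ImmutableBytesWritable, KeyValue> {
    @Override
    protected void map(LongWritable offset, Text line, Context ctx)
        throws IOException, InterruptedException {
      String[] fields = line.toString().split("\t", 2);
      byte[] row = Bytes.toBytes(fields[0]);
      KeyValue kv = new KeyValue(row, Bytes.toBytes("d"),
          Bytes.toBytes("value"), Bytes.toBytes(fields[1]));
      ctx.write(new ImmutableBytesWritable(row), kv);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "load MR results into HBase");
    job.setJarByClass(ResultLoader.class);
    job.setInputFormatClass(TextInputFormat.class);
    job.setMapperClass(ResultMapper.class);
    job.setMapOutputKeyClass(ImmutableBytesWritable.class);
    job.setMapOutputValueClass(KeyValue.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // MR results dir
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // temp HFile dir

    // Sets the reducer, total-order partitioner and reducer count to
    // match the existing region boundaries of the target table.
    HTable table = new HTable(conf, "mr_results");
    HFileOutputFormat.configureIncrementalLoad(job, table);

    if (job.waitForCompletion(true)) {
      // Move the generated HFiles into the live table
      // (same thing the "completebulkload" tool does).
      new LoadIncrementalHFiles(conf).doBulkLoad(new Path(args[1]), table);
    }
  }
}

(If the results really are plain TSV, I think the importtsv and
completebulkload tools would cover the same ground without custom code.)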
So I'm thinking of recommending that they add just one server (non-HA)
or two servers (HA) to their Hadoop cluster, and run only the HMaster
and RegionServer processes on the server(s). The HBase cluster would
use the existing (small or large) HDFS cluster and ZooKeeper ensemble.
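
Concretely, I'm picturing something like the following hbase-site.xml
on the added server(s), plus HBASE_MANAGES_ZK=false in hbase-env.sh so
HBase reuses their existing ensemble. The host names and NameNode port
below are just placeholders:

<configuration>
  <!-- Store HBase data on the existing HDFS cluster -->
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://existing-namenode:8020/hbase</value>
  </property>
  <!-- Run in distributed mode even though there is only one (or two) HBase server(s) -->
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <!-- Point at the existing ZooKeeper ensemble -->
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
  </property>
</configuration>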
The server spec will be 2 x 8-core processors and 8GB to 24GB of RAM;
the RAM size will vary depending on the data volume and access pattern.
Has anybody tried a similar configuration? How did it go?
Also, I saw Jon's slides from Hadoop World NYC 2009, which said I
should have at least 5 RegionServers / DataNodes in my cluster to get
typical performance. If I deploy the RegionServers and DataNodes on
separate servers, which one needs to be >= 5 nodes: the DataNodes, the
RegionServers, or both?
Thanks,
Tatsuya Kawano
Tokyo, Japan