Use a single cluster or two clusters for log analysis and HBase?

jingguo yao Mon, 28 Nov 2011 23:37:40 -0800

I want to set up Hadoop clusters for two workloads. One is log
analysis, which uses MapReduce to process large log files in HDFS.
The other is HBase, which serves random table queries.


I have two choices. The first is to run a single Hadoop cluster
shared by log analysis and HBase. Its advantages are:

1. There is only one Hadoop cluster to manage.
2. Both MapReduce and HBase can use the whole cluster, which has more
   storage and more computing power.

Its disadvantage is:

1. Running MapReduce jobs may slow down the random HBase table
   queries.

The second choice is to run two clusters: cluster A for log analysis
and cluster B for HBase. Its advantage is:

1. Log analysis and HBase table queries do not interfere with each
   other.

Its disadvantages are:

1. There are two Hadoop clusters to manage.
2. Each workload can use only its own smaller cluster, which has less
   storage and less computing power.

I don't know which choice is better. Can anybody give me some advice
on this? Thanks.

-- 
Jingguo
