On Tue, Jun 5, 2012 at 8:29 PM, Atif Khan <[email protected]> wrote:
> My first thoughts were to create a single HDFS cluster, and then point the
> MapReduce and HBase servers to use the common HDFS installation. However,
> Cloudera's Dos and Don'ts page
> (http://www.cloudera.com/blog/2011/04/hbase-dos-and-donts/) insists that
> MapReduce and HBase should not share an HDFS cluster. Rather they should
> have their own individual clusters. I don't understand this recommendation,
> as it would result in moving data around from one HDFS cluster to another
> when running MapReduce over HBase.
It starts out "Be careful when running mixed workloads on an HBase cluster." Does your use case fit the case described: SLAs on HBase access while at the same time running heavy MapReduce jobs on the same cluster? If so, you may want the suggested two clusters. I'd suggest you start with everything on the one cluster and see how you do. That post is more than a year old; HBase has gotten steadily better since.

St.Ack
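For what it's worth, the single-cluster setup described in the original question involves no data movement at all: HBase simply points its root directory at the common HDFS via hbase-site.xml. A minimal sketch (hostname and port are hypothetical; the NameNode address must match fs.default.name in the Hadoop core-site.xml):

<property>
  <name>hbase.rootdir</name>
  <value>hdfs://namenode.example.com:8020/hbase</value>
</property>

With that in place, MapReduce jobs reading from or writing to HBase tables run against the same HDFS, so the "moving data between clusters" concern only arises if you actually split into two clusters.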
