Thanks a lot for your suggestions.

On Wed, Nov 30, 2011 at 4:40 AM, Jahangir Mohammed <[email protected]> wrote:
> Using two separate clusters will be ideal.
>
> Thanks,
> Jahangir.
>
> On Tue, Nov 29, 2011 at 1:31 PM, Jean-Daniel Cryans <[email protected]> wrote:
>
> > At StumbleUpon we use two clusters because high throughput (like
> > MapReduce) will always kill low latency (like serving random reads).
> >
> > J-D
> >
> > On Mon, Nov 28, 2011 at 11:37 PM, jingguo yao <[email protected]> wrote:
> > > I want to set up Hadoop clusters. There are two workloads. One is log
> > > analysis which is using MapReduce to process big log files in HDFS.
> > > The other is HBase which is used to serve random table queries.
> > >
> > > I have two choices to set up my Hadoop clusters. One is to use one
> > > Hadoop cluster. Log analysis and HBase use the same cluster. Its
> > > advantages are:
> > >
> > > 1. There is only one Hadoop cluster which I need to manage.
> > > 2. Both MapReduce and HBase can use this big cluster which has more
> > >    storage and more powerful computation capability.
> > >
> > > Its disadvantages:
> > >
> > > 1. Running MapReduce jobs may slow down the random HBase table
> > >    queries.
> > >
> > > The other choice is to use two clusters. Cluster A is for log analysis.
> > > Cluster B is for HBase. Its advantages are:
> > >
> > > 1. There are no interferences between log analysis and HBase table
> > >    queries.
> > >
> > > Its disadvantages:
> > >
> > > 1. There are two Hadoop clusters which need to be managed.
> > > 2. Both log analysis and HBase queries can only use a small Hadoop
> > >    cluster which has less storage and less powerful computation
> > >    capability.
> > >
> > > I don't know which choice is better. Can anybody give me some advice
> > > on this? Thanks.
> > >
> > > --
> > > Jingguo

--
Jingguo
