Thanks a lot for your suggestions.

On Wed, Nov 30, 2011 at 4:40 AM, Jahangir Mohammed
<[email protected]> wrote:

> Using two separate clusters will be ideal.
>
> Thanks,
> Jahangir.
>
> On Tue, Nov 29, 2011 at 1:31 PM, Jean-Daniel Cryans <[email protected]>
> wrote:
>
> > At StumbleUpon we use two clusters because high throughput (like
> > MapReduce) will always kill low latency (like serving random reads).
> >
> > J-D
> >
> > On Mon, Nov 28, 2011 at 11:37 PM, jingguo yao <[email protected]>
> > wrote:
> > > I want to set up Hadoop clusters. There are two workloads. One is log
> > > analysis, which uses MapReduce to process big log files in HDFS.
> > > The other is HBase, which serves random table queries.
> > >
> > > I have two choices for setting up my Hadoop clusters. One is to use a
> > > single Hadoop cluster shared by log analysis and HBase. Its
> > > advantages are:
> > >
> > > 1. There is only one Hadoop cluster to manage.
> > > 2. Both MapReduce and HBase can use this big cluster, which has more
> > >    storage and more compute capacity.
> > >
> > > Its disadvantages:
> > >
> > > 1. Running MapReduce jobs may slow down random HBase table
> > >    queries.
> > >
> > > The other choice is to use two clusters. Cluster A is for log analysis.
> > > Cluster B is for HBase. Its advantages are:
> > >
> > > 1. There is no interference between log analysis and HBase table
> > >    queries.
> > >
> > > Its disadvantages:
> > >
> > > 1. There are two Hadoop clusters which need to be managed.
> > > 2. Log analysis and HBase queries can each use only a smaller Hadoop
> > >    cluster, which has less storage and less compute capacity.
> > >
> > > I don't know which choice is better. Can anybody give me some advice
> > > on this? Thanks.
> > >
> > > --
> > > Jingguo
> > >
> >
>



-- 
Jingguo
