Re: Managing MapReduce jobs with concurrent client reads

Stack Thu, 06 Sep 2012 13:08:49 -0700

On Wed, Sep 5, 2012 at 6:25 AM, Eric Czech <[email protected]> wrote:
> Hi everyone,
>
> Does anyone have any recommendations on how to maintain low latency for
> small, individual reads from HBase while MapReduce jobs are being run?  Is
> replication a good way to handle this (i.e. run small, low-latency queries
> against a replicated copy of the data and run the MapReduce jobs on the
> master copy)?


Is MapReduce blowing your caches, or is the higher I/O driving up
latency when you have a cache miss?  Or is it using all the CPU?

Depending on how it impinges, you could try corralling MapReduce
(cgroups/jail), or go to an extreme and keep a low-latency OLTP cluster
running only well-known, well-behaved MapReduce jobs, replicating into
a batch cluster where MapReduce is allowed free rein.  (This is what we
do where I work.  We also cgroup MapReduce even on our batch cluster so
a random big MR job doesn't make the pagers go off during sleepy time.)
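
To make the cgroups suggestion concrete, here is a minimal sketch,
assuming a Linux host with cgroup v2 and the cpu and memory controllers
enabled for the parent group.  The function names and the limits are
hypothetical, and this is not necessarily how the clusters described
above are configured.

```python
from pathlib import Path


def cpu_max_value(cores, period_us=100_000):
    """Format a cgroup v2 cpu.max value: '<quota_us> <period_us>'."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return f"{int(cores * period_us)} {period_us}"


def corral_settings(cores, memory_bytes):
    """Map cgroup v2 control files to the values that cap CPU and memory."""
    return {
        "cpu.max": cpu_max_value(cores),
        "memory.max": str(memory_bytes),
    }


def apply_settings(cgroup_dir, settings):
    """Write each limit into its control file under cgroup_dir."""
    for name, value in settings.items():
        (Path(cgroup_dir) / name).write_text(value)


def add_pid(cgroup_dir, pid):
    """Move a process (e.g. a task JVM) into the cgroup."""
    (Path(cgroup_dir) / "cgroup.procs").write_text(str(pid))
```

On a real host this would run as root against a directory such as
/sys/fs/cgroup/mapreduce (a made-up name), for example
apply_settings("/sys/fs/cgroup/mapreduce", corral_settings(4, 8 * 1024**3))
to hold the MapReduce processes to roughly four cores and 8 GiB, leaving
the rest of the machine to the RegionServer.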

St.Ack
