Neither right now -- I'm just assuming that it would be a problem, since I would definitely have to support both in a hypothetical HBase+Hadoop installation that isn't actually built yet.
Did you ever try corralling those jobs by just reducing the number of available map/reduce tasks or did you find that that isn't a reliable throttling mechanism? Also, is replication to that batch cluster done via HBase replication or some other approach?

On Thu, Sep 6, 2012 at 4:08 PM, Stack <[email protected]> wrote:
>
> On Wed, Sep 5, 2012 at 6:25 AM, Eric Czech <[email protected]> wrote:
> > Hi everyone,
> >
> > Does anyone have any recommendations on how to maintain low latency for
> > small, individual reads from HBase while MapReduce jobs are being run? Is
> > replication a good way to handle this (i.e. run small, low-latency queries
> > against a replicated copy of the data and run the MapReduce jobs on the
> > master copy)?
>
> MapReduce is blowing your caches or higher i/o is sending up latency
> when you have cache miss? Or its using all the CPU?
>
> Dependent on how its impinges, you could trying corralling mapreduce
> (cgroups/jail) or go to an extreme and keep a low latency OLTP cluster
> running well-known, well-behaved mapreduce jobs replicating into a
> batch cluster where mapreduce is allowed free rein (This is what we do
> where I work. We also cgroup mapreduce cluster even on our batch
> cluster so random big MR doesn't make the pagers go off during sleepy
> time).
>
> St.Ack
