You can have a look at OpenTSDB, which does aggregations on the data: http://opentsdb.net/
Also, you can use endpoint coprocessors to do the aggregation on a per-region basis and then merge the results on the client: http://hbase-coprocessor-experiments.blogspot.com/2011/05/extending.html
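
To make the per-region idea concrete, here is a rough sketch in plain Python. It does not use the HBase API at all: `Partial`, `aggregate_region`, `merge_partials` and the sample data are hypothetical names I made up for illustration. The point is only that each region returns a small partial result (sum and count) and the client merges those, instead of pulling every row back.

```python
from dataclasses import dataclass


@dataclass
class Partial:
    """Partial aggregate computed inside one region (hypothetical type)."""
    total: float = 0.0
    count: int = 0


def aggregate_region(values):
    """Stand-in for the server-side work of an endpoint coprocessor:
    look only at this region's own values and return a small partial."""
    partial = Partial()
    for v in values:
        partial.total += v
        partial.count += 1
    return partial


def merge_partials(partials):
    """Client-side merge of the per-region partial results."""
    merged = Partial()
    for p in partials:
        merged.total += p.total
        merged.count += p.count
    return merged


# Hypothetical data: three regions, each holding a slice of the table.
regions = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
partials = [aggregate_region(r) for r in regions]
result = merge_partials(partials)
average = result.total / result.count if result.count else 0.0
```

Note that the regions return sum and count rather than a per-region average, since averaging the averages would give the wrong answer when regions hold different numbers of rows.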
Both of these approaches give you alternatives to traditional MR.

On Thu, Jan 12, 2012 at 12:29 AM, kfarmer <[email protected]> wrote:
> I'm taking a look at moving our datastore from Oracle to HBase, and trying
> to understand how HBase could be used for ad-hoc aggregation queries across
> our data.
>
> My understanding is MapReduce is more of a batch framework, so if we want a
> query to come back to the user's request in a few seconds, that won't work
> because of the overhead of running MR and because the MR jobs write back to
> a new table. Is that correct?
>
> Instead should we be pre-aggregating data as we load into separate tables,
> and then when a user queries instead just do a scan on these pre-aggregated
> tables?
>
> Thanks.
> --
> View this message in context:
> http://old.nabble.com/HBase-for-ad-hoc-aggregate-queries-tp33123313p33123313.html
> Sent from the HBase User mailing list archive at Nabble.com.
