re: "My understanding is MapReduce is more of a batch framework,"
Yes.

re: "and because the MR jobs write back to a new table."
They can write wherever they need to write (HDFS, HBase, etc.).

You'll probably want to check out the Architecture, Data Model, and MapReduce chapters of the HBase Book/RefGuide:

http://hbase.apache.org/book.html

On 1/11/12 1:59 PM, "kfarmer" <[email protected]> wrote:

>I'm taking a look at moving our datastore from Oracle to HBase, and trying to
>understand how HBase could be used for ad-hoc aggregation queries across our
>data.
>
>My understanding is MapReduce is more of a batch framework, so if we want a
>query to come back to the user's request in a few seconds, that won't work
>because of the overhead of running MR and because the MR jobs write back to
>a new table. Is that correct?
>
>Instead should we be pre-aggregating data as we load into separate tables,
>and then when a user queries instead just do a scan on these pre-aggregated
>tables?
>
>Thanks.
>--
>View this message in context:
>http://old.nabble.com/HBase-for-ad-hoc-aggregate-queries-tp33123313p33123313.html
>Sent from the HBase User mailing list archive at Nabble.com.
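For what it's worth, the pre-aggregation idea in the question can be sketched roughly as follows. This is only an in-memory illustration in Python, not HBase client code: the `PreAggregatedTable` class, the `customer|day` row key layout, and the `count`/`sum` columns are all made up for the example. The one HBase property it leans on is that rows are kept sorted by row key, so a query becomes a range scan over a key prefix rather than an MR job.

```python
# Hypothetical in-memory stand-in for a pre-aggregated table.
# Nothing here is HBase API; names and key layout are assumptions.

def make_row_key(customer_id, day):
    # Composite key; ISO dates (YYYY-MM-DD) sort correctly as strings.
    return f"{customer_id}|{day}"


class PreAggregatedTable:
    def __init__(self):
        # row key -> {"count": int, "sum": int}
        self.rows = {}

    def record(self, customer_id, day, amount):
        # Update the running aggregate at load time, instead of
        # computing it when the user asks.
        key = make_row_key(customer_id, day)
        row = self.rows.setdefault(key, {"count": 0, "sum": 0})
        row["count"] += 1
        row["sum"] += amount

    def scan(self, start_key, stop_key):
        # Range scan in key order: start inclusive, stop exclusive.
        for key in sorted(self.rows):
            if start_key <= key < stop_key:
                yield key, self.rows[key]


def total_for_range(table, customer_id, start_day, stop_day):
    # Answer an aggregate query by summing a handful of pre-built rows.
    start = make_row_key(customer_id, start_day)
    stop = make_row_key(customer_id, stop_day)
    return sum(row["sum"] for _, row in table.scan(start, stop))
```

The trade-off is that you have to decide up front which dimensions to aggregate on (here, customer and day), so this covers known query shapes rather than truly arbitrary ad-hoc ones.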
