I'm taking a look at moving our datastore from Oracle to HBase, and trying to understand how HBase could be used for ad-hoc aggregation queries across our data.
My understanding is that MapReduce is more of a batch framework, so it won't work if we want a query to come back to the user within a few seconds, both because of the overhead of running an MR job and because MR jobs write their output back to a new table. Is that correct?

Instead, should we pre-aggregate the data into separate tables as we load it, and then, when a user queries, just do a scan on those pre-aggregated tables?

Thanks.

--
View this message in context: http://old.nabble.com/HBase-for-ad-hoc-aggregate-queries-tp33123313p33123313.html
Sent from the HBase User mailing list archive at Nabble.com.
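P.S. To make the pre-aggregation option concrete, this is roughly the shape I have in mind. It is only a minimal Python sketch with an in-memory sorted map standing in for an HBase table, not HBase client code. The `PreAggTable` class, the `region|day` row key layout, and the record fields are all hypothetical names made up for illustration, and it assumes days are written as ISO dates so that key order matches date order.

```python
from bisect import bisect_left
from collections import defaultdict


class PreAggTable:
    """In-memory stand-in for a table whose rows are read in row-key order."""

    def __init__(self):
        self._rows = defaultdict(lambda: {"sum": 0.0, "count": 0})

    def increment(self, row_key, amount):
        # Update the running aggregate for this row at load time.
        row = self._rows[row_key]
        row["sum"] += amount
        row["count"] += 1

    def scan(self, start_key, stop_key):
        # Yield rows with start_key <= key < stop_key, in key order.
        keys = sorted(self._rows)
        lo = bisect_left(keys, start_key)
        hi = bisect_left(keys, stop_key)
        for key in keys[lo:hi]:
            yield key, self._rows[key]


def load(table, records):
    # Each record is (region, day, amount); the key layout is an assumption.
    for region, day, amount in records:
        table.increment(f"{region}|{day}", amount)


def total_for_region(table, region, first_day, last_day):
    # Query time: one short range scan over pre-aggregated rows, no batch job.
    start = f"{region}|{first_day}"
    stop = f"{region}|{last_day}\xff"  # sorts just after the last_day row
    return sum(row["sum"] for _, row in table.scan(start, stop))
```

The idea is that `load` does the aggregation work once per incoming record, so `total_for_region` only has to read one row per region and day in the requested range. The obvious limitation is that this only answers queries along dimensions that were chosen up front in the row key, which is why I am unsure how well it fits truly ad-hoc queries.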
