This looks to be exactly what I need. Thanks :)

----- Original Message -----
From: "Sonal Goyal" <[email protected]>
To: [email protected]
Sent: Monday, July 25, 2011 12:03:30 AM
Subject: Re: Fanning out hbase queries in parallel

Hi Paul,

Have you taken a look at HBase coprocessors? I think you will find them useful.

Best Regards,
Sonal
Hadoop ETL and Data Integration <https://github.com/sonalgoyal/hiho>
Nube Technologies <http://www.nubetech.co>
<http://in.linkedin.com/in/sonalgoyal>

On Mon, Jul 25, 2011 at 8:13 AM, Paul Nickerson <[email protected]> wrote:

> I would like to implement a multidimensional query system that aggregates
> large amounts of data on the fly by fanning out queries in parallel. It
> should be fast enough for interactive exploration of the data and extensible
> enough to take sets of hundreds or thousands of dimensions with high
> cardinality, and aggregate them from high granularity to low granularity.
>
> Dimensions and their values are stored in the row key. For instance, row
> keys look like this:
>
> Foo=bar,blah=123
>
> and each row contains numerical values within its column families, such
> as plays=100, versioned by the date of calculation.
>
> Say a user wants the top "Foo" values with blah=123, sorted in descending
> order by total plays in July. My current thinking is that a query would be
> executed by grouping all Foo-prefixed row keys by region server and sending
> the query to each of those. Each region server iterates through all of its
> row keys that start with Foo=something,blah=, and passes the query on to all
> regions containing blahs that equal 123, which in turn contain the play
> counts. Matching row keys, as well as the sum of all their play values
> within July, are passed back up the chain and sorted/truncated when possible.
>
> It seems quite complicated and would involve either modifying the HBase
> source code or at the very least using the deep internals of the API. Does
> this seem like a practical solution, or could someone offer some ideas?
>
> Thank you!
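For illustration, here is a minimal sketch of the scatter-gather plan described in the quoted message. It is plain Python and makes no HBase client calls: regions are simulated as in-memory dicts, the cell layout (a list of (date, plays) versions per row) is an assumption, and the names parse_row_key, scan_region, top_values and SAMPLE_REGIONS are hypothetical. The per-region step is roughly the part that would run server-side, for example inside a coprocessor as suggested above.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from datetime import date


def parse_row_key(row_key):
    """Split 'Foo=bar,blah=123' into {'Foo': 'bar', 'blah': '123'}."""
    return dict(part.split("=", 1) for part in row_key.split(","))


def scan_region(region, group_by, filters, start, end):
    """Partial aggregation over one simulated region.

    Sums plays per value of the group_by dimension, for rows matching all
    filters, counting only versions dated within [start, end].
    """
    partial = Counter()
    for row_key, versions in region.items():
        dims = parse_row_key(row_key)
        if group_by not in dims:
            continue
        if any(dims.get(name) != value for name, value in filters.items()):
            continue
        plays = sum(p for d, p in versions if start <= d <= end)
        if plays:
            partial[dims[group_by]] += plays
    return partial


def top_values(regions, group_by, filters, start, end, limit):
    """Fan the query out to every region in parallel, then merge the partials."""
    with ThreadPoolExecutor(max_workers=max(1, len(regions))) as pool:
        partials = pool.map(
            lambda region: scan_region(region, group_by, filters, start, end),
            regions,
        )
        total = Counter()
        for partial in partials:
            total.update(partial)  # Counter.update adds counts together
    # Sort and truncate only after merging: a value whose rows straddle two
    # regions could otherwise be cut from a per-region top-N too early.
    return total.most_common(limit)


# Hypothetical data: two regions, each mapping row key -> [(date, plays), ...]
SAMPLE_REGIONS = [
    {
        "Foo=bar,blah=123": [
            (date(2011, 6, 30), 999),  # outside July, ignored
            (date(2011, 7, 1), 100),
            (date(2011, 7, 15), 50),
        ],
        "Foo=bar,blah=456": [(date(2011, 7, 2), 500)],  # wrong blah, ignored
    },
    {
        "Foo=baz,blah=123": [(date(2011, 7, 3), 70)],
        "Foo=qux,blah=123": [(date(2011, 7, 4), 300)],
    },
]

if __name__ == "__main__":
    print(top_values(SAMPLE_REGIONS, "Foo", {"blah": "123"},
                     date(2011, 7, 1), date(2011, 7, 31), limit=2))
```

The sketch merges full partial sums and truncates once at the end. Truncating earlier in the chain is safe only when each group's rows are known to live in a single region.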
