The following article about using Klout's Brickhouse library to access an HBase table as a map through its key might be useful:

http://brickhouseconfessions.wordpress.com/2013/08/06/squash-the-long-tail-with-brickhouses-hbase-udfs/

On Jul 24, 2014 8:56 PM, "Andrew Mains" <[email protected]> wrote:
> Agreed. As far as I can tell, there isn't any support for this currently.
>
> This JIRA (https://issues.apache.org/jira/browse/HIVE-3727, referenced in
> http://hortonworks.com/blog/hbase-via-hive-part-1/) seems relevant, but
> there's no recent work on it, and I imagine the patch included is out of
> date with trunk. Perhaps it's worth resurrecting?
>
> Andrew
>
> On 7/24/14, 4:45 PM, java8964 wrote:
>
> I don't think the HBase-Hive integration is smart enough to utilize the
> existing index in HBase, but I think it depends on the version you are
> using.
>
> From my experience, there is a lot of room for improvement in the
> HBase-Hive integration, especially in the "push down" logic into the HBase
> engine.
>
> Yong
>
> ------------------------------
> From: [email protected]
> Date: Thu, 24 Jul 2014 14:03:42 -0700
> Subject: does the HBase-Hive integration support using HBase index
> (primary key or secondary index) in the JOIN implementation?
> To: [email protected]
>
> If I do a join of a table based on a text file and a table based on HBase,
> and say the latter is very large, is Hive smart enough to utilize the HBase
> table's index to do the join, instead of implementing this as a regular
> MapReduce job, where each table is scanned fully, bucketed on join keys, and
> then the matching items are found in the reducer?
>
> thanks
> Yang
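
For reference, the two join strategies contrasted in the original question can be sketched roughly as below. This is only a toy illustration in plain Python, not Hive or HBase code: the dict stands in for an HBase table keyed by row key, and the function names (`reduce_side_join`, `lookup_join`) and sample data are hypothetical.

```python
from collections import defaultdict


def reduce_side_join(small_rows, large_rows):
    """Scan both inputs fully, bucket by join key, then match within each bucket.

    This mirrors the shape of a regular MapReduce join: the cost grows with
    the size of the large table even if only a few keys match.
    """
    buckets = defaultdict(lambda: ([], []))
    for key, value in small_rows:      # "map" over the small table
        buckets[key][0].append(value)
    for key, value in large_rows:      # "map" over the large table (full scan)
        buckets[key][1].append(value)

    joined = []
    for key in sorted(buckets):        # "reduce": match rows sharing a key
        left, right = buckets[key]
        for l in left:
            for r in right:
                joined.append((key, l, r))
    return joined


def lookup_join(small_rows, large_table):
    """Scan only the small input and do a point lookup by key on the large side.

    The dict lookup is a stand-in (an assumption) for a row-key Get against
    an HBase table.
    """
    joined = []
    for key, value in small_rows:
        match = large_table.get(key)
        if match is not None:
            joined.append((key, value, match))
    return joined


# Hypothetical sample data: a tiny text-file table and a "large" keyed table.
small = [("k2", "a"), ("k5", "b")]
large = {f"k{i}": f"row{i}" for i in range(1000)}

# Both strategies produce the same rows; only the amount of scanning differs.
assert reduce_side_join(small, large.items()) == lookup_join(small, large)
```

The lookup variant touches the large side once per row of the small table, which is the behavior the question asks whether Hive can exploit.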
