Doesn't Hive for HBase enable joins?
On Tue, May 31, 2011 at 5:06 AM, Eran Kutner <[email protected]> wrote:
> Hi,
> I need to join two HBase tables. The obvious way is to use a M/R job for
> that. The problem is that the few references to that question I found
> recommend pulling one table to the mapper and then do a lookup for the
> referred row in the second table.
> This sounds like a very inefficient way to do join with map reduce. I
> believe it would be much better to feed the rows of both tables to the
> mapper and let it emit a key based on the join fields. Since all the rows
> with the same join fields values will have the same key the reducer will be
> able to easily generate the result of the join.
> The problem with this is that I couldn't find a way to feed two tables to a
> single map reduce job. I could probably dump the tables to files in a single
> directory and then run the join on the files but that really makes no sense.
>
> Am I missing something? Any other ideas?
>
> -eran
>
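
For reference, the reduce-side join described in the quoted message can be sketched roughly as below. This is only an in-memory illustration of the map / shuffle / reduce logic in plain Python, not HBase or Hadoop code: the function names (map_row, reduce_join, join), the table tags "A" and "B", and the dict-per-row layout are all made up for the example, and it assumes an inner join on a single field.

```python
from collections import defaultdict
from itertools import product


def map_row(table, row, join_field):
    """Map step: emit (join key, (source table tag, row)) for one row."""
    return row[join_field], (table, row)


def reduce_join(tagged_rows, left, right):
    """Reduce step: pair every left row with every right row for one key."""
    lefts = [r for t, r in tagged_rows if t == left]
    rights = [r for t, r in tagged_rows if t == right]
    # Inner join: a key seen in only one table produces no output.
    return [{**l, **r} for l, r in product(lefts, rights)]


def join(left_rows, right_rows, join_field, left="A", right="B"):
    # Rows of both tables go through the same mapper, tagged by origin.
    groups = defaultdict(list)
    for table, rows in ((left, left_rows), (right, right_rows)):
        for row in rows:
            key, value = map_row(table, row, join_field)
            groups[key].append(value)  # stands in for the shuffle phase
    # One reduce call per distinct join key.
    out = []
    for key in sorted(groups):
        out.extend(reduce_join(groups[key], left, right))
    return out


if __name__ == "__main__":
    users = [{"uid": 1, "name": "ann"}, {"uid": 2, "name": "bob"}]
    orders = [{"uid": 1, "item": "x"}, {"uid": 3, "item": "z"}]
    for joined in join(users, orders, "uid"):
        print(joined)
```

Tagging each value with its source table is what lets the reducer tell the two sides apart once rows from both tables arrive under the same key.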
