Using external indexes in an HBase Map/Reduce job...

Michael Segel Tue, 12 Oct 2010 05:36:55 -0700

Hi,

Now I realize that most everyone is sitting in NY, while some of us can't leave 
our respective cities....


Came across this problem and I was wondering how others solved it.

Suppose you have a really large table with 1 billion rows of data. 
Since HBase really doesn't have any indexes built in (Don't get me started 
about the contrib/transactional stuff...), you're forced to use some sort of 
external index, or roll your own index table.

The net result is that you end up with a list object that contains your result 
set.

So the question is... what's the best way to feed the list object in?

One option I thought about is writing the object to a file, using that file as 
the job's input, and then controlling how it gets split. Not the most efficient, 
but it would work.

Was trying to find a more 'elegant' solution, and I'm sure that anyone using 
SOLR or LUCENE or whatever... has come across this problem too.

Any suggestions? 

Thx
