Hi All,

I am evaluating HBase and I am not sure whether our
use case fits naturally with HBase's capabilities.  I
would appreciate any help.

We would like to store a large number (billions) of
rows in HBase using a key field to access the values. 
We will then need to continually add, update, and
delete rows.  This is our master table.  What I
describe here naturally fits into what HBase is
designed to do.

It’s this next part that I’m having trouble finding
documentation for.

We would like to use HBase's parallel processing
capabilities to periodically build temporary tables
from it on request.  We would scan the master table's
rows, reading the key and field values, and from those
build a second table organized differently from the
master table.  We would also need to compute counts,
maximums, minimums, and other aggregates specific to
the particular request.
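
To make the shape of that computation concrete, here is a minimal in-memory sketch of the map-and-reduce steps described above.  The field names ("region", "amount") and sample rows are made up for illustration, and this is plain Python rather than the actual HBase/Hadoop API; it only shows the re-key-then-aggregate pattern, not how it would run on a cluster.

```python
from collections import defaultdict

# Hypothetical master-table rows: row key -> field values.
# Field names and values are invented for this example.
master = {
    "row1": {"region": "east", "amount": 10},
    "row2": {"region": "west", "amount": 5},
    "row3": {"region": "east", "amount": 7},
}

def map_phase(rows):
    """Emit (new_key, value) pairs, re-keying each row by a different field."""
    for _, fields in rows.items():
        yield fields["region"], fields["amount"]

def reduce_phase(pairs):
    """Group by the new key and compute count/min/max per group."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {
        key: {"count": len(vs), "min": min(vs), "max": max(vs)}
        for key, vs in groups.items()
    }

# The "second table", keyed differently from the master and
# carrying the per-request aggregates.
derived = reduce_phase(map_phase(master))
```

In a real deployment the map and reduce functions would run in parallel across regions, with the framework handling the grouping step between them.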

This seems like textbook map-reduce functionality, but
I don't see much in the HBase documentation referencing
this kind of setup.  Also, HBase's ten-minute startup
guide states that "[HBase doesn't] need mapreduce".

I suppose we could use HBase as an input to and output
from Hadoop's MapReduce framework.  If we did that,
what would guarantee that each map task was reading
local data?

Any help would be greatly appreciated.  If you have a
reference to a previous discussion or document I could
read, that would be appreciated as well.

-FA
