I think the determining factor in choosing HBase over raw HDFS files is really the consumption pattern. If you're only ever going to process the data in bulk, then chances are you'll get the most performance out of a raw HDFS file. However, if you need random access to some of the entries, then HBase will give you a significant benefit.
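To make the contrast concrete, here's a toy sketch in plain Python (not the actual HBase or HDFS APIs; the names and data are made up for illustration). A flat file forces you to scan everything, while a keyed store lets you jump straight to one row:

```python
# Illustrative only: in-memory stand-ins for a flat HDFS file (a list you
# must scan) and an HBase-style keyed table (a dict you can probe by row key).

def bulk_scan(records):
    """Flat-file style: touch every record sequentially (great for MapReduce)."""
    return sum(value for _key, value in records)

def random_access(index, key):
    """HBase style: jump straight to one row by key, no full scan needed."""
    return index[key]

records = [("row-%03d" % i, i) for i in range(100)]
index = dict(records)  # stands in for HBase's key-ordered row lookup

total = bulk_scan(records)             # reads all 100 records
one = random_access(index, "row-042")  # reads exactly one
```

If your workload looks like `bulk_scan`, a raw file wins; if it looks like `random_access`, you want the indexed store.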
There are other factors that go into this decision. One that comes to mind off the top of my head is whether you'd like to take advantage of HBase's versioning and semi-defined schema for your dataset. It would be a little complicated to duplicate all of that logic on your own on top of a flat file.
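Here's a rough sketch of what "duplicating that logic yourself" would mean: a toy versioned cell store in plain Python (again, not the real HBase API; class and column names are invented), keeping the last N timestamped values per (row, column) cell the way HBase does out of the box:

```python
# Illustrative only: a minimal versioned cell store mimicking the bookkeeping
# you'd have to rebuild on top of flat files. All names here are made up.
import time
from collections import defaultdict

class VersionedTable:
    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        # (row, column) -> list of (timestamp, value), newest first
        self.cells = defaultdict(list)

    def put(self, row, column, value, ts=None):
        versions = self.cells[(row, column)]
        versions.insert(0, (ts if ts is not None else time.time(), value))
        del versions[self.max_versions:]  # trim to the newest N versions

    def get(self, row, column):
        """Return the newest value, or None if the cell was never written."""
        versions = self.cells.get((row, column))
        return versions[0][1] if versions else None

t = VersionedTable(max_versions=2)
t.put("user1", "info:email", "old@example.com", ts=1)
t.put("user1", "info:email", "new@example.com", ts=2)
```

Note the "semi-defined schema" falls out for free: any (row, column) pair can be written without declaring columns up front, which is exactly what's painful to retrofit onto a flat file.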
Another factor is your system's workflow. If you use HDFS files, you need to be OK with rewriting the files to do any "updates". So even if you only add 1MB worth of new data to a 1TB dataset, you have to rewrite the whole thing. HBase would let you "insert" it where it belongs. (Of course, HBase operates under the same HDFS constraints your applications do; it's just that we've already done the work of managing random inserts.)
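A toy sketch of that difference in plain Python (illustrative only; the record sizes are tiny stand-ins for the 1TB / 1MB numbers above, and nothing here is a real HBase or HDFS call):

```python
# Illustrative only: the "flat file" path rebuilds the whole dataset to add
# one record; the "table" path inserts in place into a sorted structure.
import bisect

def update_flat_file(old_records, new_record):
    """HDFS-file style: produce a complete new copy with the record added."""
    return sorted(old_records + [new_record])  # rewrites everything

def insert_in_table(sorted_records, new_record):
    """HBase style: place the record where it belongs, touching little else."""
    bisect.insort(sorted_records, new_record)
    return sorted_records

data = [("k%02d" % i, i) for i in range(0, 10, 2)]  # 5 existing records
rewritten = update_flat_file(data, ("k03", 3))      # copies all 5 + 1
insert_in_table(data, ("k03", 3))                   # mutates data in place
```

The cost of `update_flat_file` scales with the whole dataset; the cost of `insert_in_table` scales with the one record being added, which is the essence of the tradeoff described above.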
Does this help you out?
-Bryan
On May 13, 2008, at 10:13 AM, Naama Kraus wrote:
Hi,
Can anyone say a few words on when to use HBase as opposed to using plain MapReduce on input files?

In more detail: when would it make sense to put data into HBase and then use HBase methods to access it, including running MapReduce on the data in the tables, as opposed to simply putting the data into HDFS and processing it with MapReduce?

Thanks, Naama
On Wed, Mar 12, 2008 at 12:15 AM, Bryan Duxbury <[EMAIL PROTECTED]>
wrote:
I've written up a blog post discussing when I think it's appropriate to use HBase, in response to some of the questions people usually ask. You can find it at http://blog.rapleaf.com/dev/?p=26.
-Bryan
--
"If you want your children to be intelligent, read them fairy tales. If you want them to be more intelligent, read them more fairy tales." (Albert Einstein)