Comments inline:

> -----Original Message-----
> From: Mork0075 [mailto:[EMAIL PROTECTED]
> Sent: Thursday, August 21, 2008 8:48 AM
> To: [EMAIL PROTECTED]; hbase-user@hadoop.apache.org
> Subject: Re: Why is scaling HBase much simpler then scaling a relational db?
>
> Thank you, but I still don't get it.
>
> I've read tons of websites and papers, but there's no clear and
> well-founded answer to "why use BigTable instead of relational databases".
>
> MySQL Cluster seems to offer the same scalability and level of
> abstraction, without switching to a non-relational paradigm. Lots of
> blog posts are highly emotional, without answering the core question:
I think you'd find that when the size of your data approaches 10-100 TB, relational databases run out of gas. Further, as your data grows, with a relational database you need to add another shard, redistribute your data, and make the client aware that rows are now split over n+1 shards instead of n (there is a small sketch of this at the end of this message). Bigtable has shown that it can scale to 100s of TB of data (or even more - I don't have any recent numbers on the largest Bigtable instance). All of this can be done by just bringing up a new server: data is redistributed automatically, and client applications do not need to be changed.

> "Why RDBMS don't scale and why something like BigTable does". Often you
> read something like this:
>
> "They have also built a system called BigTable, which is a Column
> Oriented Database, which splits a table into columns rather than rows
> making it much simpler to distribute and parallelize."
>
> Why?

In a column-oriented data store, nulls are free. Not so for a row-oriented database, which must allocate space for a column even if the current value is null.
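To make the null-storage point concrete, here is a minimal Java sketch. It is purely illustrative: the column names and the map-of-cells layout are my own example, not HBase's or Bigtable's actual storage format.

    import java.util.HashMap;
    import java.util.Map;

    // Purely illustrative -- not HBase's real on-disk layout.
    public class SparseVsFixed {

        // Row-oriented, fixed layout: every row reserves a slot for
        // every column, so a null still costs space.
        static final String[] COLUMNS = {"name", "email", "phone", "fax"};

        static String[] fixedRow(String name, String email) {
            String[] row = new String[COLUMNS.length]; // 4 slots allocated
            row[0] = name;
            row[1] = email;
            // phone and fax stay null, but their slots still exist
            return row;
        }

        // Column-oriented, sparse layout: only cells that actually hold
        // a value are stored, keyed by column name.
        static Map<String, String> sparseRow(String name, String email) {
            Map<String, String> cells = new HashMap<String, String>();
            cells.put("name", name);
            cells.put("email", email);
            // nothing stored for phone or fax -- the nulls are free
            return cells;
        }

        public static void main(String[] args) {
            System.out.println(fixedRow("alice", "a@example.com").length);  // 4
            System.out.println(sparseRow("alice", "a@example.com").size()); // 2
        }
    }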
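And to go back to the sharding point above: with manual sharding, the client itself decides which database holds a row, typically with something like a hash-modulo scheme. The sketch below is hypothetical (the shard host names and the hashing scheme are made up for illustration), but it shows why going from n to n+1 shards forces both data movement and client changes, whereas an HBase/Bigtable client just talks to the table and the system moves data around on its own.

    import java.util.Arrays;
    import java.util.List;

    // Hypothetical client-side sharding for a relational setup; the host
    // names and the hash-modulo scheme are only for illustration.
    public class ManualSharding {

        // With n shards, the client itself decides where a row lives.
        static String shardFor(String rowKey, List<String> shards) {
            int idx = (rowKey.hashCode() & 0x7fffffff) % shards.size();
            return shards.get(idx);
        }

        public static void main(String[] args) {
            List<String> threeShards = Arrays.asList("db1", "db2", "db3");
            List<String> fourShards = Arrays.asList("db1", "db2", "db3", "db4");

            String key = "user:12345";
            // Adding a shard silently changes the answer for existing keys,
            // so rows have to be moved and every client has to be updated.
            System.out.println(shardFor(key, threeShards));
            System.out.println(shardFor(key, fourShards));
        }
    }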