Re: Why is scaling HBase much simpler then scaling a relational db?

Mork0075 Thu, 21 Aug 2008 10:13:12 -0700

Thanks a lot for all replies, this is really helpful.

As you describe it, it's a problem of implementation. BigTable is designed to scale; there are routines to shard the data and distribute it to the pool of connected servers. Could MySQL perhaps decide tomorrow to implement something similar, or does the relational model prevent this?

As far as I can see so far, scalability in HBase comes for free, whereas there's no adequate pre-implemented solution for (free) relational database software.



Fernando Padilla wrote:

I'm no expert, but maybe I can explain it the way I see it; maybe it will resonate with other newbies like me :) Sorry if it's long-winded, or boring for those who already know all this.
BigTable and Hadoop are inherently sharded and distributed. They are architected to store the data in redundant shards across many machines. This allows you to add more capacity (both processing bandwidth and storage capacity) simply by adding more machines to host your cluster.
MySQL is implemented around the idea of a single database running on a single machine. This isn't inherently bad, as long as you have a machine with large enough storage and processing bandwidth (memory/CPU). You can implement sharding over MySQL (and other databases), but the application will have to be hand-tailored to work in such a setup. You're essentially implementing a home-grown data storage system from a collection of regular databases.
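To make "hand-tailored" a bit more concrete, here is a minimal sketch of what application-level sharding could look like. Everything in it is an assumption for illustration: the `ShardedUserStore` class and the `users` table are hypothetical, and in-memory SQLite databases stand in for separate MySQL servers so the example runs on its own.

```python
import sqlite3
import zlib


class ShardedUserStore:
    """Hypothetical application-level sharding layer over several databases."""

    def __init__(self, num_shards):
        # Assumption: each in-memory SQLite database stands in for one MySQL server.
        self.shards = [sqlite3.connect(":memory:") for _ in range(num_shards)]
        for db in self.shards:
            db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

    def shard_for(self, user_id):
        # A stable hash of the key, modulo the shard count, picks the database.
        return zlib.crc32(str(user_id).encode("utf-8")) % len(self.shards)

    def put(self, user_id, name):
        db = self.shards[self.shard_for(user_id)]
        db.execute(
            "INSERT OR REPLACE INTO users (id, name) VALUES (?, ?)",
            (user_id, name),
        )
        db.commit()

    def get(self, user_id):
        db = self.shards[self.shard_for(user_id)]
        row = db.execute(
            "SELECT name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return row[0] if row else None
```

The routing rule lives in the application, so the application has to apply it on every read and write. With plain modulo routing like this, changing the number of shards also changes where most keys map, so existing rows would have to be moved by hand.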
You mentioned the MySQL Cluster technology, which is an attempt to bring native sharding and distribution of processing bandwidth to MySQL. But at least from their early architecture (I have not kept up with any evolution of it), it was not yet close to real sharding or full scalability. The technology could be described as a collection of master-master in-memory database nodes backed by a collection of persistence nodes. So it worked great as long as you had enough RAM to hold your whole database in a single node.
Every system has its pros and cons.
A single MySQL is simpler and solid, and people have lots of experience with it. If your application allows, and with some application-specific architecting, you could shard MySQL to some high level of scalability. But this really depends on your application data and the queries you run over that data, or on how much extra engineering you're willing to do (and maintain) to make queries that span multiple shards run efficiently.
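As a rough illustration of that extra engineering, here is a sketch of a query that spans shards, done as scatter-gather in application code. The names `make_shards` and `top_orders` and the `orders` table are hypothetical, and in-memory SQLite databases again stand in for separate MySQL servers.

```python
import sqlite3


def make_shards(rows, num_shards):
    """Spread (id, amount) rows over several databases by id modulo shard count."""
    # Assumption: each in-memory SQLite database stands in for one MySQL server.
    shards = [sqlite3.connect(":memory:") for _ in range(num_shards)]
    for db in shards:
        db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
    for order_id, amount in rows:
        shards[order_id % num_shards].execute(
            "INSERT INTO orders (id, amount) VALUES (?, ?)", (order_id, amount)
        )
    return shards


def top_orders(shards, limit):
    """Return the `limit` largest orders across all shards."""
    partial = []
    # Scatter: every shard computes its own local top `limit`.
    for db in shards:
        partial.extend(
            db.execute(
                "SELECT id, amount FROM orders ORDER BY amount DESC LIMIT ?",
                (limit,),
            ).fetchall()
        )
    # Gather: merge the partial results and sort again in the application.
    partial.sort(key=lambda row: row[1], reverse=True)
    return partial[:limit]
```

A single database would answer this with one `ORDER BY ... LIMIT` query. Here the application has to query every shard, merge the partial results, and sort again, and joins or aggregates across shards need similar custom code.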
BigTable and Hadoop are implemented to support sharding and distributed queries from the get-go, so you can easily scale out without having to add or maintain more complex or homegrown software, just more hardware.
Mork0075 wrote:
Thank you, but I still don't get it.
I've read tons of websites and papers, but there's no clear and well-founded answer to "why use BigTable instead of relational databases".
MySQL Cluster seems to offer the same scalability and level of abstraction, without switching to a non-relational paradigm. Lots of blog posts are highly emotional, without answering the core question:
"Why don't RDBMSs scale, and why does something like BigTable?" Often you read something like this:
"They have also built a system called BigTable, which is a Column Oriented Database, which splits a table into columns rather than rows, making it much simpler to distribute and parallelize."
Why?

Really confusing ... ;)

Stuart Sierra wrote:
On Tue, Aug 19, 2008 at 9:44 AM, Mork0075 <[EMAIL PROTECTED]> wrote:
Can you please explain why someone should use HBase for horizontal
scaling instead of a relational database? One reason for me would be
that I don't have to implement the sharding logic myself. Are there others?
A slight tangent -- there are various tools that implement sharding
over relational databases like MySQL.  Two that I know of are
DBSlayer,
http://code.nytimes.com/projects/dbslayer
and MySQL Proxy,
http://forge.mysql.com/wiki/MySQL_Proxy

I don't know of any formal comparisons between sharding traditional
database servers and distributed databases like HBase.
-Stuart
