Re: Why is scaling HBase much simpler then scaling a relational db?

2008-08-28 Thread Mork0075
Thank you very much for your effort! So it really depends on what you want to use it for. If you're thinking about it, you probably have some kind of scale issues. Not at the moment. Actually our software runs on a single server, with web server, database, file storage, and Lucene side by side. But we're

Re: Why is scaling HBase much simpler then scaling a relational db?

2008-08-27 Thread Edward J. Yoon
Hi, Planet-scale data exploration and data mining operations will almost always need to include some sequential scans. So how can we speed up sequential scans? The BigTable paper shows how: * Column-oriented storage (it reduces I/O) * Data compression * PDP (parallel distributed processing)

RE: Why is scaling HBase much simpler then scaling a relational db?

2008-08-27 Thread Jonathan Gray
Discussion inline. Your example with the friends makes perfect sense. Can you imagine a scenario where storing the data in a column-oriented instead of a row-oriented db (a counterexample, if you will) causes such a huge performance mismatch, like the friends one in the row/column comparison?

Re: Why is scaling HBase much simpler then scaling a relational db?

2008-08-21 Thread Mork0075
Thank you, but I still don't get it. I've read tons of websites and papers, but there's no clear and well-founded answer as to why one should use BigTable instead of relational databases. MySQL Cluster seems to offer the same scalability and level of abstraction, without switching to a non-relational paradigm.

Re: Why is scaling HBase much simpler then scaling a relational db?

2008-08-21 Thread Fernando Padilla
I'm no expert, but maybe I can explain it the way I see it; maybe it will resonate with other newbies like me :) Sorry if it's long-winded, or boring for those who already know all this. BigTable and Hadoop are inherently sharded and distributed. They are architected to store the data in

RE: Why is scaling HBase much simpler then scaling a relational db?

2008-08-21 Thread Jonathan Gray
A few very big differences... - HBase/BigTable don't have transactions in the same way that a relational database does. While it is possible (and was just recently implemented for HBase, see HBASE-669) it is not at the core of this design. A major bottleneck of distributed multi-master

Re: Why is scaling HBase much simpler then scaling a relational db?

2008-08-21 Thread Mork0075
Thanks a lot for all the replies, this is really helpful. As you describe it, it's a problem of implementation. BigTable is designed to scale: there are routines to shard the data and distribute it to the pool of connected servers. Could MySQL perhaps decide tomorrow to implement something similar

Re: Why is scaling HBase much simpler then scaling a relational db?

2008-08-20 Thread Stuart Sierra
On Tue, Aug 19, 2008 at 9:44 AM, Mork0075 [EMAIL PROTECTED] wrote: Can you please explain why someone should use HBase for horizontal scaling instead of a relational database? One reason for me would be that I don't have to implement the sharding logic myself. Are there others? A slight

RE: Why is scaling HBase much simpler then scaling a relational db?

2008-08-20 Thread Jim Kellerman
Stuart, In general you will get a quicker response to HBase questions by posting them to the HBase mailing list ([EMAIL PROTECTED]) see http://hadoop.apache.org/hbase/mailing_lists.html for how to subscribe. Perhaps the best document on scaling HBase is actually the Bigtable paper:

Re: Why is scaling HBase much simpler then scaling a relational db?

2008-08-19 Thread Mork0075
Thanks, this was really informative :) Bigtable uses both. First it splits row ranges based on size. It also has the ability to detect hot row ranges and will split a region if it becomes too hot. This is tricky because you don't want to have a hot range split off and then have it drop below

Re: Why is scaling HBase much simpler then scaling a relational db?

2008-08-18 Thread Mork0075
I've read some papers and tutorials this week and now have some concrete questions: (1) Sharding is also available in common relational systems. It is often described that you need an application layer for the federation of shards. I understand HBase to be this layer, which implements the whole

RE: Why is scaling HBase much simpler then scaling a relational db?

2008-08-18 Thread Jim Kellerman
Please note that you will get a prompt response about HBase questions if you ask them on the HBase user list ( [EMAIL PROTECTED] ) -----Original Message----- From: Mork0075 [mailto:[EMAIL PROTECTED] Sent: Sunday, August 17, 2008 11:51 PM To: core-user@hadoop.apache.org Subject: Re: Why is

Re: Why is scaling HBase much simpler then scaling a relational db?

2008-08-07 Thread Steve Loughran
Mork0075 wrote: Hello, can someone please explain or point me to some documentation or papers where I can read well-proven facts on why scaling a relational db is so hard and scaling a document-oriented db isn't? http://labs.google.com/papers/bigtable.html relational dbs are great for