OK. Going to try this again. After reading through these emails I think I have learned a little more about the way you are thinking. I DO NOT want to start some kind of flame war. However, I disagree very strongly with what you are saying. Yes, you are right, sharding does require more complexity from the application layer. Sorry to all you developers out there (and I can safely say that I am NOT a developer!!). The fundamental issue for you, as I see it, is the increased complexity caused by sharding the application.

That being said, I will say this...if you develop on some other RDBMS such as MS or Oracle, is it possible to develop something like you are saying...an all-inclusive database that isn't "sharded"? Yep. When I worked at Netzero in 2001, for example, we had two database servers running Oracle, one on the east coast in Virginia and one on the west coast in California. The east coast server was a backup of the west coast server. So one database server did the billing for all of Netzero's customers. Millions of customers..absolutely. All in one nice tidy box that I am sure was easier to develop the billing applications around.

Here is the kicker. Each box was a top of the line Sun server that had 32 processors and 32 gigs of RAM. They could handle up to 64 procs and 64 gigs. And each cost well over a million dollars for the hardware alone. Running Oracle on it must have cost over 100,000 dollars for software licenses. Granted this was in 2001, but the licensing costs for Oracle haven't gone down any that I am aware of...and the hardware cost would still be quite steep to do this type of thing.
So I ask you this..

Would it be better to go with that scenario or something like this:

Implement the billing application using MySQL. Shard it. Create complexity. Your hardware cost savings alone will pay for multiple developers to handle any complexity increases. Any decent DBA is going to be able to handle the multiple servers required to operate this setup. You will probably see a decrease in salary cost moving from Oracle to MySQL DBAs. So for the bottom line of the company it is an overall win by far. It is only the inherent difficulty in moving complex systems from one type of DB to another that keeps more companies from switching. Why hasn't this happened previously?? Because until version 4 of MySQL was stable there were many features not available in MySQL that were needed by these types of systems.
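
To make the "create complexity" part concrete, here is a minimal sketch in Python of the kind of routing logic a sharded billing application would have to carry. The host names, the shard_for and group_by_shard helpers, and the choice to hash on customer id are all assumptions made up for illustration; this is not how Netzero or any company mentioned here actually did it.

```python
import hashlib

# Hypothetical shard map: these host names are invented for illustration.
SHARDS = [
    "billing-db-0.example.com",
    "billing-db-1.example.com",
    "billing-db-2.example.com",
    "billing-db-3.example.com",
]


def shard_for(customer_id, shards=SHARDS):
    """Return the host assumed to hold this customer's billing rows."""
    # A stable hash keeps the mapping identical across processes and restarts.
    digest = hashlib.md5(str(customer_id).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


def group_by_shard(customer_ids, shards=SHARDS):
    """Group customer ids by host so a report can query each server once."""
    groups = {}
    for cid in customer_ids:
        groups.setdefault(shard_for(cid, shards), []).append(cid)
    return groups


if __name__ == "__main__":
    print(shard_for(42))
    for host, ids in sorted(group_by_shard(range(10)).items()):
        print(host, ids)
```

Every query now has to go through something like shard_for first, and anything that spans customers has to be split per server and merged in the application. With plain modulo hashing like this, adding a server also changes where most customers map, so a real setup would likely need a lookup table or a migration plan. That is the complexity being traded for cheaper hardware.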

It is my contention that as the clustering capabilities of MySQL continue to grow and mature (think of when version 6.0 goes stable) companies will move to MySQL in droves. THEN you have the ability to build a single "virtual" database (at least from the point of view of your application) that will scale simply and elegantly. As I said in the previous email, it is only that 5.1 is in beta that keeps this from being available now. And many companies, such as Kaneva, are doing this right now. The only reason that companies like Digg and Flickr can exist and grow at such phenomenal rates is that they keep the cost of developing the system to a minimum and the overhead of operating it (licensing costs and hardware costs) as low as possible. In addition, of course, they need the ability to scale out very quickly. Digg didn't get any significant funding until just recently. And yet they epitomize the web 2.0 companies. They did it by both keeping their costs down and having the ability to grow quickly. Couldn't have done it with Oracle or MS.
Just my thoughts :)

Keith

Naz Gassiep wrote:
Wow.
    The problem with sharding I have is the large amount of code
required in the app to make it work. IMHO the app should be agnostic to
the underlying database system (by that I don't mean the DB in use such
as MySQL or whatever or the schema, I mean the way the DB has been
deployed) so that changes can be made to it without having to worry
about impacting app code. This is one of my fundamental design imperatives.

    Then again, I'm not a regular MySQL user so I don't know what is and
is not the norm in the MySQL world.

- Naz.

Evaldas Imbrasas wrote:
You certainly have a right to disagree, but pretty much every
scalability talk at the MySQL conference a few weeks ago was focused
on data partitioning and sharding. And those talks were given by folks
working for some of the most popular (top 100) websites in the world.
It certainly looks like data partitioning is the way to go in the
MySQL world at this point, probably at least until production-ready
and feature-full MySQL Cluster is out. And even then a large percentage
of dotcom companies would use data partitioning instead since it can
be implemented on commodity hardware.

Once again, we're talking *really* big websites using MySQL (not
Oracle or SQL Server or whatever) here. Most websites won't ever need
to partition their production databases, and different RDBMSs might have
different approaches for scalability.


On 5/24/07, Naz Gassiep <[EMAIL PROTECTED]> wrote:
Data partitioning? Sorry, I disagree that partitioning a table into more
and more servers is the way to scale properly. Perhaps putting
databases' tables onto different servers with different hardware
designed to meet different usage patterns is a good idea, but data
partitioning was a very short lived idea in the world of databases and
I'm glad that as an idea it is dying in practice.



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]