Re: [Drizzle-discuss] Understanding MySQL Internals

Tom Hanlon Mon, 15 Mar 2010 22:26:07 -0700

Rehan,

On 16 Mar 2010, at 00:15, Rehan Iftikhar wrote:

Hi
I was wondering how relevant the content in Understanding MySQL Internals (http://www.amazon.com/Understanding-MySQL-Internals-Sasha-Pachev/dp/0596009577) is to Drizzle?

Sasha Pachev's book on MySQL internals would be significantly more relevant to Drizzle than a book on Oracle internals.

The core Drizzle team will likely claim that the two are very different, but they have spent the last year making them different and working on the differences, so the similarities might be somewhat invisible to them. So I will take a shot at an overview, and they can correct me as needed.

I would say that until a Drizzle internals book is written, that book would be the closest. I think most would agree that the differences are significant, but the core of Drizzle is still a derived work from MySQL. In addition, the storage engine concept, along with compatibility with the major MySQL storage engines such as InnoDB, is in Drizzle as well.

I cannot find my copy of Sasha Pachev's book, or I would go through it and tell you how well it covers the concepts that are similar.

The Drizzle team can add some details, but as far as I can tell, here are some things that are different and some things that are the same.


Authentication:
Drizzle is plugin based: PAM, http_auth and others.

MySQL has built-in authentication of user, host and password at the DB, table and column level, stored in a database table.


Thread management:

I assume that this is similar. Until very recently MySQL was a single multi-threaded process with one thread allocated per connection. That thread might be cached when the user disconnected and re-used for an incoming connection. In recent versions a "pool of threads" optimization has been added, where a pool of threads is allocated for user connections and those threads are used as needed. I am not sure what code base Drizzle started with, or how stable pool of threads was at that point, so I do not know what Drizzle uses.
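To make the two connection models concrete, here is a minimal Python sketch. It is only an illustration of the general idea, not MySQL or Drizzle code: handle_connection, thread_per_connection and pool_of_threads are hypothetical names, and a "connection" is just an integer id.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def handle_connection(conn_id):
    # Stand-in for serving one client session (hypothetical).
    # Returns the connection id and the name of the thread that served it.
    return conn_id, threading.current_thread().name


def thread_per_connection(conn_ids):
    """Start one dedicated thread for every incoming connection."""
    results = []
    lock = threading.Lock()

    def run(conn_id):
        served = handle_connection(conn_id)
        with lock:
            results.append(served)

    threads = [threading.Thread(target=run, args=(cid,)) for cid in conn_ids]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def pool_of_threads(conn_ids, pool_size):
    """Serve every connection from a fixed-size pool of worker threads."""
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        return list(pool.map(handle_connection, conn_ids))
```

With the first function the number of threads grows with the number of connections; with the second it is capped at pool_size and connections wait for a free worker.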


Parser:

I have not heard much chatter about the Drizzle parser, so I assume it is derived from the MySQL parser. I imagine MySQL's dual-license model kept the MySQL parser from reusing some open source parser libraries, so perhaps the code has been cleaned up, but I am only guessing.


Optimizer:

I have not heard much chatter here either, so I assume that the optimizer is derived from the MySQL optimizer as well. The Drizzle team can correct me if I am wrong.


Replication:

MySQL relied upon what had been a statement based binary log, meaning that if a statement might have changed data, it was written to a log file at the SQL layer and the slave would replay the statements. Features were added in MySQL 5.1 so that, instead of logging the statements that may have changed or added rows, the server asks the storage engine for copies of the changes and places those in the binary log in a "row" based format. Statement based was still supported and it was/is messy. Statement based had some issues, but row based added some issues and some confusion and some bugs.

Drizzle tore all of that out and implemented replication capability based upon Google Protocol Buffers: http://code.google.com/apis/protocolbuffers/docs/overview.html
Jay covers the internals fairly well in a series of blog posts here: http://www.joinfu.com/2009/10/drizzle-replication-changes-in-api-to-support-group-commit/

It is fair to say that there are significant differences between the two systems regarding replication.
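One well-known issue with statement based logging is a statement that calls a non-deterministic function such as UUID(). The toy Python sketch below shows why replaying the statement text and replaying the changed rows can give different results. Everything in it (the dict used as a table, run_statement, the log formats) is made up for illustration and does not reflect the MySQL binary log or the Drizzle replication format.

```python
import uuid


def run_statement(table, statement):
    """Execute the single toy statement this sketch understands.

    Returns the list of (key, new_value) row changes it produced.
    """
    if statement != "UPDATE t SET token = UUID()":
        raise ValueError("unsupported statement in this sketch")
    changes = []
    for key in sorted(table):
        new_value = uuid.uuid4().hex  # non-deterministic, like UUID()
        table[key] = new_value
        changes.append((key, new_value))
    return changes


def replicate(initial_rows, statement):
    """Run a statement on a master and feed two replicas from two logs."""
    master = dict(initial_rows)
    stmt_replica = dict(initial_rows)
    row_replica = dict(initial_rows)

    # The master executes once and records both kinds of log entry.
    row_log = run_statement(master, statement)
    statement_log = [statement]

    # Statement based: the replica re-executes the text and rolls new UUIDs.
    for logged in statement_log:
        run_statement(stmt_replica, logged)

    # Row based: the replica copies the values the master actually wrote.
    for key, new_value in row_log:
        row_replica[key] = new_value

    return master, stmt_replica, row_replica
```

Calling replicate({1: "a", 2: "b"}, "UPDATE t SET token = UUID()") leaves the row based replica identical to the master, while the statement based replica ends up with different tokens.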


Transaction stuff:

MySQL at the core was not a transactional database; it was made to work somehow with transactional storage engines. I sometimes would think of the MySQL server, or the SQL layer, as coordinating a group transaction across the underlying storage engines. The relationship was complicated, and it complicated replication in ways that are hard to go into briefly, but in a purely transactional system the same log and system that is used for transactional consistency and durability can typically be used to assist the replication process. I am not sure what Drizzle's statement of intent regarding transactions is. But it is important to note that the MySQL way led to a somewhat messy implementation, and it seems that Drizzle has been hard at work, even lately, cleaning that up.

It seems that any database that allows plugins for storage engines is going to have to hand off the durability requirements to the storage engines, so in a rough outline things are somewhat similar, but the differences will be many.

I could picture Drizzle being more transactional while still allowing the storage engines to ignore the transactional stuff, whereas MySQL was not transactional and forced the storage engines to do extra work in order to be transactional.
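The "group transaction" picture can be sketched roughly as below, in Python. This is a simplified prepare-then-commit loop of my own, with hypothetical class and function names; it is not how MySQL or Drizzle actually talk to their storage engines. It only shows a server layer coordinating several engines, one of which ignores the transactional calls.

```python
class TransactionalEngine:
    """Toy engine that buffers writes until commit (hypothetical)."""

    def __init__(self, name, fail_on_prepare=False):
        self.name = name
        self.fail_on_prepare = fail_on_prepare
        self.committed = {}
        self.pending = {}

    def write(self, key, value):
        self.pending[key] = value

    def prepare(self):
        # Report whether this engine is able to commit its pending writes.
        return not self.fail_on_prepare

    def commit(self):
        self.committed.update(self.pending)
        self.pending.clear()

    def rollback(self):
        self.pending.clear()


class NonTransactionalEngine:
    """Toy engine that applies writes immediately and ignores the rest."""

    def __init__(self, name):
        self.name = name
        self.committed = {}

    def write(self, key, value):
        self.committed[key] = value  # visible at once, cannot be undone

    def prepare(self):
        return True

    def commit(self):
        pass

    def rollback(self):
        pass  # nothing buffered, so nothing to undo


def run_group_transaction(writes):
    """Apply (engine, key, value) writes as one coordinated transaction."""
    engines = []
    for engine, key, value in writes:
        engine.write(key, value)
        if engine not in engines:
            engines.append(engine)
    # Commit everywhere only if every engine agrees; otherwise roll back.
    if all(engine.prepare() for engine in engines):
        for engine in engines:
            engine.commit()
        return True
    for engine in engines:
        engine.rollback()
    return False
```

If one transactional engine refuses to prepare, the other transactional engines discard their pending writes, but the write to the non-transactional engine stays in place. That leftover write is the kind of mess the coordinating layer has to live with.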

If you are looking for an understanding of the relational model, how SQL is optimized and how joins are performed, I found Dan Tow's book SQL Tuning (http://www.amazon.com/SQL-Tuning-Dan-Tow/dp/0596005733) to be helpful. It goes through the concepts of indexes and joins really well.

If I find my copy of Sasha Pachev's book I can give you a better review.




--
Tom  Hanlon

--
-Rehan
_______________________________________________
Mailing list: https://launchpad.net/~drizzle-discuss
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~drizzle-discuss
More help   : https://help.launchpad.net/ListHelp


