Koa McCullough wrote:
> Mats,
>
> Just a few thoughts.
>
> 1. Correct me if I'm wrong but, isn't replication going to be a module?
> It makes me kind of nervous to hear about changes that will tie drizzle
> to one form of replication, whatever format that might be.

As a general principle, I prefer to modularize code as far as possible. That is not currently the case with the MySQL code base, and it is something I've been working to change. I will not "hard code" replication into the server; however, having said that...

> 2. There are some very good reasons to use statement based replication.
> -Users do dumb things from time to time and it is nice to be able to
> restore a db w/ a known position, parse the output from mysqlbinlog to
> remove a given transaction, and roll the database forward. I know this
> doesn't work in all situations but, it has been a very good "undo"
> button for us in the past.

That is indeed a valid case, but I feel that the ability to edit the statements is just a way to avoid building proper tool support. Considering all the problems associated with statement-based replication, I don't see statement-based replication as an advantage.

Statement-based replication gives:

- Auditing, but this can be handled, e.g., by allowing comments in the replication log.
- Editing or removing statements, but this can be handled by sending the "corrected" statement (if any) through a normal client interface and then starting replication from just after the offending statement/transaction.

AFAICT, all of the "editing" cases I can envision can be solved by a proper API to the replication log (but if you have some cases, real or imaginary, that you think could pose a problem, I would be happy to hear them).

The format I am considering is basically a combined redo-undo log. This means that it would be possible to run the log both backwards and forwards, so the situation you describe could easily be solved by running the log backwards and then starting replication from a different position, effectively skipping a set of events.

The advantage of a data-oriented (a.k.a. row-based) format is that the data can be distributed over several threads without regard to which statements it comes from, allowing higher efficiency.
It also allows statements to be applied at the same time on both the master and the slave (by sending the rows as they are applied), reducing the delay between the commit on the master and the commit on the slave to basically just the network latency.

As tool support, I am considering libraries that allow "playing" replication logs in a controlled manner using scripts, with the scripting language being either Lua or Perl (or maybe both; it depends on where I put the scripting support).

> -The performance gain you are talking about applies to the number of
> seconds behind the db is but, in a situation where there is high latency
> and the db is for disaster recovery I would rather have a faster relay
> log.

There has to be a buffer of some kind on the slave to handle a large inflow of changes, so there will always be something similar to a relay log. This means that for disaster recovery, the data will potentially be at the slave before it is actually written on the master.

> I am assuming that the "raw data" format you are talking about
> would be a row based and probably larger than the tradition statement
> based system.

Yes, it would be something like a row-based format, but the row-based format is not always larger than the statement-based format. It depends entirely on the queries. For example, INSERT statements are usually smaller in row-based format.

> I guess I would rather see drizzle stay flexible where replication is
> concerned. I'm all for making a quicker executing replication the
> default. What do you think?

I want to make replication fast, compact, simple, and flexible; but above all I want to keep our options open (after all, we cannot read the future).

Just my few cents,
Mats Kindahl

>
> Just my take on things.
>
> --Koa
>
> On Sun, 2008-08-03 at 21:32 +0200, Mats Kindahl wrote:
>> Hi!
>>
>> Long-term, I would like to see the table definitions stored in
>> (transactional) tables.
>> That would mean that:
>>
>> - DDL operations can be replicated as table changes, even in row format
>> - DDL operations can be transactional (provided the tables holding the
>> transactional data are transactional)
>>
>> Since we are aiming for a micro-server that can be a component of a
>> cloud, I think that we, long-term, have to move to a raw data format for
>> replication: the current replication system lacks in performance and it
>> would be beneficial to be able to treat all changes that need to
>> replicate in the same manner. This simplifies the logic, hence shrinks
>> the code base and reduces the number of code paths, hence improves
>> performance of the replication (and of the system in general).
>>
>> Just my few cents,
>> Mats Kindahl
>>
>> Brian Aker wrote:
>>> Hi!
>>>
>>> 1) Engines should be asked about tables, and tables belong to engines.
>>> 2) We still need some sort of FRM for Engines which have no ability to
>>> handle their own data dictionaries.
>>>
>>> The current Unireg format is a real pain in the butt and has not been
>>> scaling well with recent additions. It is a pre-set binary type.
>>>
>>> Three methods for storage:
>>> 1) XML (slow)
>>> 2) SQL (tamper-able/less slow)
>>> 3) Binary Stream
>>>
>>> Personally I am for types 2 or 3 for a serial method. In fact I am for
>>> both being added to the table class.
>>>
>>> For the binary type though I see a few choices:
>>> 1) Google Protocols (not in distributions, beta)
>>> 2) Science Types/ASN (also not in distributions)
>>> 3) Envelope style (you have to roll your own, but cake to do (think ID3)).
>>>
>>> Thoughts? Additional ones?
>>>
>>> I lean toward #3 only because I fear the hassle factor in the above ones
>>> (it is also simple and easy to understand). I am hoping we can get our
>>> hands into Innodb and just use its for tables it has.
>>>
>>> Oh, and why do engines need to know?
>>> If an engine is not available we do
>>> not want to know about its tables, and if an engine is network based
>>> things get a lot more complicated.
>>>
>>> BTW for methods think:
>>>
>>> tbl = new Table(binary/char array);
>>>
>>> and:
>>>
>>> char *ptr= table->serialize();
>>>
>>> or
>>> char *ptr= table->create_table_string();
>>>
>>> Cheers,
>>> -Brian
>>>
>>> --
>>> _______________________________________________________
>>> Brian "Krow" Aker, brian at tangent.org
>>> Seattle, Washington
>>> http://krow.net/ <-- Me
>>> http://tangent.org/ <-- Software
>>> _______________________________________________________
>>> You can't grep a dead tree.
>>>
>>> _______________________________________________
>>> Mailing list: https://launchpad.net/~drizzle-discuss
>>> Post to : [email protected]
>>> Unsubscribe : https://launchpad.net/~drizzle-discuss
>>> More help : https://help.launchpad.net/ListHelp
>
-- 
Mats Kindahl
Lead Software Developer
Replication Team
MySQL AB, www.mysql.com
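
P.S. To make the combined redo-undo idea from the reply above a bit more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not the actual log format: the names RowEvent, redo, undo and skip_events are hypothetical, a "table" is just a dict keyed by primary key, and each event carries a before image (for undo) and an after image (for redo). The sketch runs the log backwards to the earliest offending event and then forwards again, skipping the unwanted positions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RowEvent:
    """Hypothetical log record holding both undo and redo information."""
    key: int
    before: Optional[dict]  # row image before the change (None: row did not exist)
    after: Optional[dict]   # row image after the change (None: row was deleted)


def _install(table: dict, key: int, image: Optional[dict]) -> None:
    """Put a row image in place, or remove the row if the image is None."""
    if image is None:
        table.pop(key, None)
    else:
        table[key] = dict(image)


def redo(table: dict, event: RowEvent) -> None:
    """Run one event forwards by installing its after image."""
    _install(table, event.key, event.after)


def undo(table: dict, event: RowEvent) -> None:
    """Run one event backwards by installing its before image."""
    _install(table, event.key, event.before)


def skip_events(table: dict, log: list, bad_positions: set) -> None:
    """Roll back to the earliest bad position, then replay without the bad events."""
    if not bad_positions:
        return
    first_bad = min(bad_positions)
    # Backwards pass: undo everything from the end of the log to first_bad.
    for event in reversed(log[first_bad:]):
        undo(table, event)
    # Forwards pass: redo every later event except the ones being skipped.
    for pos in range(first_bad, len(log)):
        if pos not in bad_positions:
            redo(table, log[pos])


# Example: position 2 is an accidental DELETE that should be taken out.
log = [
    RowEvent(1, None, {"name": "a"}),           # INSERT row 1
    RowEvent(2, None, {"name": "b"}),           # INSERT row 2
    RowEvent(1, {"name": "a"}, None),           # DELETE row 1 (the mistake)
    RowEvent(2, {"name": "b"}, {"name": "c"}),  # UPDATE row 2
]
table = {}
for event in log:
    redo(table, event)

skip_events(table, log, {2})
print(table)
```

Note that the forwards pass installs after images blindly. If a later event touches the same row as a skipped one, a real implementation would need to detect and resolve that conflict; the sketch does not attempt this.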

