MARK CALLAGHAN wrote:
Jay,

> Kudos on writing code that is easy to read. I have some questions
> about the new interfaces.

Hi! And I'm happy to answer questions! :)

> When serial_event_log.cc is used,

Quick FYI, we've changed the name of the plugin from "serial event log" to "command log", as the log is not serialized (in a transactional sense)...

> the Applier implementation writes
> Commands to a file. Are there plans to provide an interface to supply
> Command instances for a slave to replay?

Yep, absolutely, and this interface will debut late this week or early next week. Two additional plugins, called Publisher and Subscriber (loosely modeled after MySQL's Master and Slave concepts), are coming; they communicate the state of a server and pass streams of Command messages from the publisher to the subscriber.

> How is the serial log to be kept in sync with a storage engine given
> the Applier interface? MySQL uses two phase commit, but the Applier
> interface has one method, ::apply().

In short, I don't know yet. I've found that as I develop the command log plugin and the subscriber/publisher plugins, I go through an iterative change process, and things tend to change quickly. I imagine that the first round of plugins and interfaces will be filtered and inspected by folks such as yourself and Robert Hodges to work out the kinks.

What would be most useful is if you could describe, in pseudo-code, the kind of interface a potential developer of a semi-sync replication plugin would *want* (or need) to use. Think about the kind of information that the Command message would need to contain. Paul McCullagh and I discussed in April adding a member to the TransactionContext sub-message called "engine_trx_id", which would contain a storage-engine-specific identifier (LSN, transaction ID, etc.) that could be passed through the replication API so that the same storage engine on different nodes could coordinate with each other.

If this member is added to the Command, then technically the command log itself would not be necessary at all. Storage engines could simply replicate commands directly between each other, communicating in the most efficient method possible using their own internal APIs and bypassing any "double logging".

> In addition to methods for
> performing 2PC, keeping a storage engine and the serial log in sync
> requires additional methods for crash recovery to support commit or
> rollback of transactions in state PREPARED in the storage engine
> depending on the outcome recorded in the serial log.

Understood, but I believe this is an implementation detail that a synchronous replication plugin should deal with, not necessarily the kernel. Again, let's focus on figuring out how the API (and the information contained in the Command message) would need to change in order for such a plugin to be able to do its work.

Very much interested in your input and feedback,

Jay

_______________________________________________
Mailing list: https://launchpad.net/~drizzle-discuss
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~drizzle-discuss
More help   : https://help.launchpad.net/ListHelp