On Jul 24, 2009, at 11:29 PM, Michael Izioumtchenko wrote:
Alex Yurchenko wrote:
On Fri, 24 Jul 2009 10:56:20 +0200, Paul McCullagh
<[email protected]> wrote:
On Jul 23, 2009, at 3:15 PM, Stewart Smith wrote:
On Tue, Jul 21, 2009 at 09:28:54PM -0700, MARK CALLAGHAN wrote:
How is the serial log to be kept in sync with a storage engine given
the Applier interface? MySQL uses two-phase commit, but the Applier
interface has one method, ::apply(). In addition to methods for
performing 2PC, keeping a storage engine and the serial log in sync
requires additional methods for crash recovery to support commit or
rollback of transactions in state PREPARED in the storage engine,
depending on the outcome recorded in the serial log.
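To make the recovery requirement concrete, here is a minimal sketch of the decision described above: after a crash, each transaction the engine still holds in PREPARED is committed if the serial log recorded its commit, and rolled back otherwise. The names TxnState and recover are hypothetical and exist only for illustration; they are not part of the Applier interface or of any engine API.

```python
from enum import Enum


class TxnState(Enum):
    PREPARED = "prepared"
    COMMITTED = "committed"
    ROLLED_BACK = "rolled_back"


def recover(engine_txns, serial_log_committed):
    """Resolve transactions left in PREPARED after a crash.

    engine_txns: dict of transaction id -> TxnState as found in the engine.
    serial_log_committed: set of transaction ids whose commit is recorded
    in the serial log (hypothetical representation).
    """
    resolved = {}
    for txn_id, state in engine_txns.items():
        if state is TxnState.PREPARED:
            # The serial log decides the outcome of in-doubt transactions.
            if txn_id in serial_log_committed:
                resolved[txn_id] = TxnState.COMMITTED
            else:
                resolved[txn_id] = TxnState.ROLLED_BACK
        else:
            # Already-decided transactions are left untouched.
            resolved[txn_id] = state
    return resolved
```

With a single ::apply() method there is no point at which the engine could be left in PREPARED or later be told the outcome, which is the gap the question points at.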
The bit that keeps banging in my head in regards to this is storing it
in the same engine as part of the transaction and so avoiding 2PC.
We discussed this on Drizzle Day, and that was my recommendation.
This would mean, after a transaction has committed, the
replication system asks the engine for a "list of operations"
that were performed by the transaction.
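A rough sketch of that idea, assuming the engine keeps the operation list in a table written inside the same transaction as the data. SQLite (via Python's standard sqlite3 module) is used here only as a stand-in for a single transactional engine; the repl_log table and the run_txn and operations_for helpers are hypothetical names, not Drizzle APIs.

```python
import sqlite3


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute(
        "CREATE TABLE repl_log ("
        "seq INTEGER PRIMARY KEY AUTOINCREMENT, txn_id INTEGER, op TEXT)"
    )
    conn.commit()
    return conn


def run_txn(conn, txn_id, statements, fail=False):
    """Apply statements and record each one in repl_log in ONE transaction."""
    try:
        with conn:  # commits on success, rolls back on exception
            for sql, params in statements:
                conn.execute(sql, params)
                conn.execute(
                    "INSERT INTO repl_log (txn_id, op) VALUES (?, ?)",
                    (txn_id, f"{sql} {params!r}"),
                )
            if fail:
                raise RuntimeError("simulated failure before commit")
    except RuntimeError:
        pass  # data and log entries were rolled back together


def operations_for(conn, txn_id):
    """What the replication system would ask for after the commit."""
    rows = conn.execute(
        "SELECT op FROM repl_log WHERE txn_id = ? ORDER BY seq", (txn_id,)
    )
    return [row[0] for row in rows]
```

Because the data and the operation list commit or roll back together, the log can never disagree with the engine, so no second commit phase is needed between them.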
I welcome the idea of a meaningful conversation between the
replication system and the engine. Mark's crash recovery challenge
could probably be solved by asking the engine to store a small piece
of data in the redo log; then, in the course of normal engine crash
recovery, the engine reports it back to replication so that
replication knows exactly what to replay.
One thing I'm confident cannot be solved without it is the problem
where the application 'optimizes' redo log flushing, as is done with
innodb_flush_log_at_trx_commit.
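A toy model of the redo-log marker, under the assumption that each commit record carries a replication position and that the log is not flushed at every commit (as with the relaxed innodb_flush_log_at_trx_commit settings). RedoLog and its methods are hypothetical and do not correspond to any real engine interface.

```python
from dataclasses import dataclass, field


@dataclass
class RedoLog:
    durable: list = field(default_factory=list)   # records that reached disk
    buffered: list = field(default_factory=list)  # records still in memory

    def append_commit(self, txn_id, repl_pos):
        # The replication position rides along with the commit record.
        self.buffered.append((txn_id, repl_pos))

    def flush(self):
        self.durable.extend(self.buffered)
        self.buffered.clear()

    def crash(self):
        # Anything not yet flushed is lost.
        self.buffered.clear()

    def recover(self):
        """Return the replication position of the last durable commit."""
        return self.durable[-1][1] if self.durable else None
```

After a crash, recover() tells replication the last position the engine actually retained, so replay can start from there even when commits acknowledged to the client never reached disk.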
OTOH, I hope that 'ask for a list of operations after the commit' is
just an algorithm description, not the actual implementation. I think
the communication could be made more simultaneous, especially for the
regime where the engine is normally asked to do row-by-row operations.
No, it is not the actual implementation: 'ask for a list of operations
after the commit' is just a high-level description of what should
happen. One way to implement this was suggested by Brian:
On Jul 24, 2009, at 8:21 PM, Brian Aker wrote:
To make this happen... you would need to skip the trigger for the
replication, not hard, and then have a separate thread convert the
log to protos and push it into the replication storage piece.
Using this method we are guaranteed never to lose a transaction
during replication, and there is no need to use two-phase commit (as
long as only one storage engine is involved in a transaction).
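For illustration only, the separate thread could look roughly like the sketch below. It assumes committed transactions are read from the engine's log as (txn_id, operations) pairs; JSON stands in for the protobuf messages, and a queue and a plain list stand in for the engine log and the replication storage. The names shipper and ship_all are hypothetical.

```python
import json
import queue
import threading


def shipper(engine_log, replication_store):
    """Drain committed entries and store one serialized message for each."""
    while True:
        entry = engine_log.get()
        if entry is None:  # sentinel: no more entries
            break
        txn_id, operations = entry
        # JSON is a stand-in for the protobuf message here.
        message = json.dumps({"txn_id": txn_id, "operations": operations})
        replication_store.append(message)


def ship_all(entries):
    engine_log = queue.Queue()
    replication_store = []
    worker = threading.Thread(target=shipper, args=(engine_log, replication_store))
    worker.start()
    for entry in entries:
        engine_log.put(entry)
    engine_log.put(None)  # tell the worker to stop
    worker.join()
    return replication_store
```

The committing thread never waits for replication here; the worker only sees entries that are already committed, which is why nothing committed can be skipped.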
--
Paul McCullagh
PrimeBase Technologies
www.primebase.org
www.blobstreaming.org
pbxt.blogspot.com