[MariaDB developers] Re: Update on MDEV-34705 implementing binlog in InnoDB

Markus Mäkelä via developers Thu, 05 Dec 2024 08:28:45 -0800

Hi,

On 12/5/24 18:02, Kristian Nielsen wrote:

What about the following idea?


1. Implement BEFORE_WRITE semi-sync mode. The master will not write
    transactions to the binlog until at least one slave has acknowledged.

2. This means that if the master crashes, when it comes back up it will have
    no transaction that does not exist on at least one running node
    (assuming at most a single failure at a time).

3. When the master restarts, it will go into read-only mode and wait for
    MaxScale (or other management system) to tell it what to do, similar to
    MDEV-34878.

4. If MaxScale decides to keep it as the master, it will briefly set it up
    as a slave and make sure it has replicated the latest GTID on any slave
    in the replication topology. Then it will be set read-write and continue
    as the master.

5. If MaxScale decides to promote another server as the new master, the old
    master is kept in read-only mode and configured as a slave. The
    BEFORE_WRITE ensures the old master will not be ahead of the new master.

This requires the ability in MaxScale to do (4).
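
Steps 3 to 5 above can be sketched roughly as follows. This is a toy model only, not MaxScale or server code: the names Node, catch_up and recover are hypothetical, and a GTID is reduced to a single sequence number in one replication domain, which is an assumption made for brevity.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    gtid_seq: int           # assumption: one GTID domain, so a plain sequence number
    read_only: bool = True  # step 3: a restarted server waits in read-only mode


def catch_up(node, peers):
    """Step 4: act briefly as a slave until node has the latest GTID seen on any peer."""
    newest = max((p.gtid_seq for p in peers), default=node.gtid_seq)
    node.gtid_seq = max(node.gtid_seq, newest)


def recover(restarted, peers, keep_as_master):
    """Return the node that ends up as the read-write master."""
    if keep_as_master:
        # Step 4: replicate the missing transactions, then resume writes.
        catch_up(restarted, peers)
        restarted.read_only = False
        return restarted
    # Step 5: promote the most advanced peer; the old master stays read-only.
    new_master = max(peers, key=lambda p: p.gtid_seq)
    new_master.read_only = False
    restarted.read_only = True
    return new_master


# The old master crashed at sequence 10; BEFORE_WRITE means it cannot be
# ahead of the most advanced slave.
old = Node("old_master", 10)
slaves = [Node("slave1", 12), Node("slave2", 11)]
master = recover(old, slaves, keep_as_master=True)
```

In this model the only state the manager needs is the latest GTID on each node, which is the point of BEFORE_WRITE: the restarted server never has to discard anything.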

I think this will be much more robust than having a crashed server try to
remove transactions already written to the binlog, and having to configure
the server to have one or another role when it starts up.

Instead, all servers in the replication topology always wait at startup for
the manager to replicate any missing transactions from the appropriate
server, and then either set it read-write as a master or continue as a
slave.

What do you think? Of course, this is all for the future, it requires
implementing BEFORE_WRITE in the server first. But I think it sounds
promising.

I think that sounds like a good idea. In step 4, instead of briefly
replicating the lost changes and resuming writes on the same node, I
think MaxScale could just move all writes to the node with the newest
GTID and turn off read-only there, essentially performing a switchover
to another node. I think that it might actually already handle this
case as it can happen with AFTER_SYNC.
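
A minimal sketch of that alternative, under the same simplifying assumption that a GTID can be compared as a single sequence number in one domain. The function name switchover_to_newest is hypothetical and does not correspond to any MaxScale API.

```python
def switchover_to_newest(gtid_seq_by_node):
    """Pick the node with the newest GTID as the write target.

    Returns the chosen node name and a read-only flag per node:
    only the chosen node has read-only turned off.
    """
    target = max(gtid_seq_by_node, key=gtid_seq_by_node.get)
    read_only = {name: name != target for name in gtid_seq_by_node}
    return target, read_only


target, read_only = switchover_to_newest(
    {"old_master": 10, "slave1": 12, "slave2": 11}
)
```

Compared with step 4, nothing has to be replicated before writes resume; the crashed node simply rejoins as a slave of the chosen node.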

However, I'd imagine that this BEFORE_WRITE mode might not be super
useful for manually managed replication. You'd have to always switch
over to another node when a server crashes. All in all, BEFORE_WRITE
sounds promising and we'd definitely appreciate it, but it also doesn't
seem super useful outside of this somewhat niche use-case. However, I do
still think semi-sync is generally useful and thus this does seem like
something that, as you said, should be implemented eventually in the
binlog-in-engine mode.

I'm looking forward to seeing more progress updates on this; it all
seems very interesting.


Markus

--
Markus Mäkelä, Senior Software Engineer
MariaDB Corporation

