On Sun, Jun 16, 2019 at 7:30 PM Stephen Frost <sfr...@snowman.net> wrote:
> Ok, so you want fewer checkpoints because you expect to failover to a
> replica rather than recover the primary on a failure. If you're doing
> synchronous replication, then that certainly makes sense. If you
> aren't, then you're deciding that you're alright with losing some number
> of writes by failing over rather than recovering the primary, which can
> also be acceptable but it's certainly much more questionable.

Yes, in our setup that's the case: a few lost transactions will have a
negligible impact on the business.

> I'm getting the feeling that your replicas are async, but it sounds like
> you'd be better off with having at least one sync replica, so that you
> can flip to it quickly.

They are indeed async; we traded durability for performance here, because
we can accept some lost transactions.

> Alternatively, having a way to more easily make
> the primary stop accepting new writes, flush everything to the replicas,
> report that it's completed doing so, to allow you to promote a replica
> without losing anything, and *then* go through the process on the
> primary of doing a checkpoint, would be kind of nice.

I suppose that would require being able to demote a master to a slave
during runtime. That would definitely be nice-to-have.

> Thanks,
>
> Stephen
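
P.S. The "flush everything to the replicas and report that it's completed"
step can be roughly approximated by hand today by comparing WAL positions
once writes to the primary have stopped. Below is a minimal sketch, not a
tested procedure. It assumes the two LSN strings were already fetched, for
example with pg_current_wal_lsn() on the primary and
pg_last_wal_replay_lsn() on the replica (PostgreSQL 10 and later). The
helper names parse_lsn, replica_caught_up and lag_bytes are made up for
illustration.

```python
def parse_lsn(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' into an integer.

    An LSN is printed as two hexadecimal numbers separated by a slash:
    the high 32 bits and the low 32 bits of a 64-bit WAL position.
    """
    high, low = lsn.strip().split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lag_bytes(primary_lsn: str, replica_replay_lsn: str) -> int:
    """Bytes of WAL the replica still has to replay (never negative)."""
    return max(0, parse_lsn(primary_lsn) - parse_lsn(replica_replay_lsn))


def replica_caught_up(primary_lsn: str, replica_replay_lsn: str) -> bool:
    """True if the replica has replayed everything up to primary_lsn.

    Hypothetical helper: only meaningful if the primary has stopped
    accepting writes before primary_lsn was read.
    """
    return lag_bytes(primary_lsn, replica_replay_lsn) == 0


if __name__ == "__main__":
    # Example values, not taken from a real cluster.
    print(replica_caught_up("16/B374D848", "16/B374D848"))
    print(lag_bytes("16/B374D848", "16/B374D800"))
```

The LSNs are converted to integers because comparing them as plain strings
gives wrong answers (for instance "9/0" would sort after "10/0").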