On Mon, Jul 8, 2019 at 10:01 PM Jehan-Guillaume de Rorthais <[email protected]> wrote:
> I should have stepped up to this thread, sorry :)

Really appreciate all the assistance so far.

> The real problem is not how many xacts you will lose during failover,
> but how we can choose the best standby to elect. This election needs
> the timeline and LSN location of all standbys. And today, to fetch the
> timeline, we must issue a CHECKPOINT, then read the controldata file.
>
> I dug into xlog.c today. Maybe I can write a small extension to get the
> timeline from shared memory directly and make pgsqlms use it if it
> detects it. So people can decide if they feel like it is too invasive
> or really needed for their use case. Maybe in the next release. What do
> you think? Would it be useful to you?

Yes, that would be a really useful addition IMO. I would definitely use
it. If we can avoid taking a checkpoint, that will save precious minutes
during a failover, and the risk of timeouts would be drastically reduced.
Would be happy to test it if you want!

> > I managed to improve the average time checkpoints are taking already
> > from what I mentioned in that thread, mainly by decreasing
> > checkpoint_timeout and setting full_page_writes = off; ostensibly not
> > necessary on ZFS.
>
> Setting "full_page_writes" off helps lower the amount of WAL produced,
> not the amount of writes to sync during the checkpoint. But I am sure
> it helps your performance :)

If I'm saturating the IO capacity of my system during a forced checkpoint
and full_page_writes = off reduces IO by reducing the amount of WAL, then
it should help in an indirect way?

> Lowering "checkpoint_timeout" probably helps. As checkpoints occur more
> frequently, there is statistically less data to sync when a forced
> checkpoint happens during a failover.
>
> Regards,
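
To make the election step discussed above concrete, here is a minimal
sketch in Python of ranking standbys by timeline and LSN. It is only an
illustration, not how pgsqlms does it: the `Standby` class, the
`parse_lsn` and `elect_best_standby` helpers and the node names are all
hypothetical. It assumes the highest timeline wins, with the LSN as the
tie-breaker, and that both values were already collected from each node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Standby:
    """Hypothetical record of what the election needs from one standby."""
    name: str      # node name (made up for this sketch)
    timeline: int  # timeline ID reported for this standby
    lsn: str       # textual LSN in PostgreSQL's "hi/lo" hex form


def parse_lsn(lsn: str) -> int:
    """Convert a textual LSN such as "16/B374D848" to a 64-bit integer.

    The two hex halves must be compared numerically: compared as plain
    strings, "A/0" would sort after "10/0", which is wrong.
    """
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def elect_best_standby(standbys):
    """Return the standby with the highest (timeline, LSN) pair.

    Assumption of this sketch: timeline is compared first, LSN second.
    """
    if not standbys:
        raise ValueError("no standby to elect")
    return max(standbys, key=lambda s: (s.timeline, parse_lsn(s.lsn)))


if __name__ == "__main__":
    candidates = [
        Standby("srv1", timeline=3, lsn="0/5000000"),
        Standby("srv2", timeline=4, lsn="0/4000000"),
        Standby("srv3", timeline=4, lsn="0/4000028"),
    ]
    print(elect_best_standby(candidates).name)
```

The comparison itself is cheap; the expensive part during a failover is
obtaining a trustworthy timeline for each node, which is exactly what
reading it from shared memory would speed up.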
