Re: replay pause vs. standby promotion

Fujii Masao Mon, 23 Mar 2020 07:37:47 -0700



On 2020/03/23 22:46, Sergei Kornilov wrote:

Hello

(I am trying to find an opportunity to review this patch...)


Thanks for the review! It's really helpful!

Consider a test case with streaming replication:

on primary: create table foo (i int);
on standby:

postgres=# select pg_wal_replay_pause();
  pg_wal_replay_pause
---------------------

(1 row)


postgres=# select pg_is_wal_replay_paused();
  pg_is_wal_replay_paused
-------------------------
  t
(1 row)

postgres=# table foo;
  i
---
(0 rows)

Execute "insert into foo values (1);" on primary

postgres=# select pg_promote ();
  pg_promote
------------
  t
(1 row)

postgres=# table foo;
  i
---
  1

So one additional change was replayed during promotion. I think this is the
wrong behavior. Possibly it can be fixed by

+    if (PromoteIsTriggered()) break;
     /* Setup error traceback support for ereport() */
     errcallback.callback = rm_redo_error_callback;


Do you mean that the promotion request should cause the recovery
to finish immediately, even if there are still outstanding WAL records,
and cause the standby to become the master? I don't think that
is the expected (or the existing) behavior of the promotion. That is,
the promotion request should cause the recovery to replay as many
WAL records as possible, to the end, in order to avoid data loss. No?

If we would like a promotion method that finishes recovery
immediately, IMO we should implement something like
"pg_ctl promote -m fast". That is, we need to add a new mode to
the promotion.
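
To make the difference between the two behaviors concrete, here is a toy
model in C. It is only a sketch and not PostgreSQL code: the enum, the
function name and the record counters are all hypothetical, and
PROMOTE_IMMEDIATE stands in for the kind of "fast" mode suggested above,
which does not exist today.

```c
#include <stddef.h>

/*
 * Hypothetical promotion modes (names invented for illustration).
 * PROMOTE_REPLAY_ALL models the behavior described above;
 * PROMOTE_IMMEDIATE models a possible new "finish recovery now" mode.
 */
typedef enum
{
    PROMOTE_REPLAY_ALL,
    PROMOTE_IMMEDIATE
} PromoteMode;

/*
 * Toy model of a standby whose replay is paused.
 *
 * 'replayed' is the number of WAL records already applied and
 * 'received' is the number of WAL records that have arrived so far.
 * Returns the number of records applied once promotion completes.
 */
size_t
promote_standby(size_t replayed, size_t received, PromoteMode mode)
{
    /* Immediate mode: stop where we are, outstanding records are dropped. */
    if (mode == PROMOTE_IMMEDIATE)
        return replayed;

    /* Replay-all mode: apply every outstanding record before finishing. */
    while (replayed < received)
        replayed++;

    return replayed;
}
```

In the test case above, one record (the insert) has been received but not
replayed while replay is paused. promote_standby(0, 1, PROMOTE_REPLAY_ALL)
returns 1, which matches the row becoming visible after pg_promote(), while
promote_standby(0, 1, PROMOTE_IMMEDIATE) returns 0, i.e. the insert would
be lost.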

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters
