Re: [HACKERS] Support for N synchronous standby servers - take 2

Kyotaro HORIGUCHI Tue, 05 Apr 2016 03:18:26 -0700

At Tue, 5 Apr 2016 18:08:20 +0900, Fujii Masao <masao.fu...@gmail.com> wrote in 
<CAHGQGwG+DM=lcctg6q_uxkgk17cblkrhbwtpfun3+hu9qbv...@mail.gmail.com>
> On Mon, Apr 4, 2016 at 10:00 PM, Masahiko Sawada <sawada.m...@gmail.com> 
> wrote:
> > On Mon, Apr 4, 2016 at 6:03 PM, Kyotaro HORIGUCHI
> > <horiguchi.kyot...@lab.ntt.co.jp> wrote:
> >> Hello, thank you for testing.
> >>
> >> At Sat, 2 Apr 2016 14:20:55 +1300, Thomas Munro 
> >> <thomas.mu...@enterprisedb.com> wrote in 
> >> <CAEepm=2sdDL2hs3XbWb5FORegNHBObLJ-8C2=aaeg-riztd...@mail.gmail.com>
> >>> > One thing I noticed is that there are LOG messages telling me when a
> >>> > standby becomes a synchronous standby, but nothing to tell me if a
> >>> > standby stops being a standby (ie because a higher priority one has
> >>> > taken its place in the quorum).  Would that be interesting?
> >>
> >> A walsender exits by proc_exit() for any operational
> >> termination so wrapping proc_exit() should work. (Attached file 1)
> >>
> >> For the setting "2(Sby1, Sby2, Sby3)", the master says that all
> >> of the standbys are sync-standbys and no message is emited on
> >> failure of Sby1, which should cause a promotion of Sby3.
> >>
> >>>  standby "Sby3" is now the synchronous standby with priority 3
> >>>  standby "Sby2" is now the synchronous standby with priority 2
> >>>  standby "Sby1" is now the synchronous standby with priority 1
> >> ..<Sby 1 failure>
> >>>  standby "Sby3" is now the synchronous standby with priority 3
> >>
> >> Sby3 becomes sync standby twice:p
> >>
> >> This was a behavior taken over from the single-sync-rep era but
> >> it should be confusing for the new sync-rep selection mechanism.
> >> The second attached diff makes this as the following.
...
> >> Applying the both of the above patches, the message would be like
> >> the following.
> >>
> >>> 17:54:08.367 LOG:  standby "Sby3" is now a synchronous standby with 
> >>> priority 3
> >>> 17:54:08.564 LOG:  standby "Sby1" is now a synchronous standby with 
> >>> priority 1
> >>> 17:54:08.565 LOG:  standby "Sby2" is now a synchronous standby with 
> >>> priority 2
> >>> 17:54:18.387 LOG:  standby "Sby3" is now a potential synchronous standby 
> >>> with priority 3
> >>> 17:54:28.887 LOG:  synchronous standby "Sby1" with priority 1 exited
> >>> 17:54:31.359 LOG:  standby "Sby3" is now a synchronous standby with 
> >>> priority 3
> >>> 17:54:39.008 LOG:  standby "Sby1" is now a synchronous standby with 
> >>> priority 1
> >>> 17:54:41.382 LOG:  standby "Sby3" is now a potential synchronous standby 
> >>> with priority 3
> >>
> >> Does this make sense?
> >>
> >> By the way, Sawada-san, you have changed the parentheses for the
> >> priority method from '[]' to '()'. And I mistankenly defined
> >> s_s_names as '2[Sby1, Sby2, Sby3]' and got wrong behavior, that
> >> is, only Sby2 is registed as mandatory synchronous standby.
> >>
> >> For this case, the tree members of SyncRepConfig are '2[Sby1,',
> >> 'Sby2', "Sby3]'. This syntax is valid for the current
> >> specification but will surely get different meaning by the future
> >> changes. We should refuse this known-to-be-wrong-in-future syntax
> >> from now.
> >>
> >
> > I have no objection about current version patch.
> > But one optimise idea I came up with is to return false before
> > calculation of lowest LSN from sync standby if MyWalSnd is not listed
> > in sync_standby.
> > For example in SyncRepGetOldestSyncRecPtr(),
> >
> > ==
> > sync_standby = SyncRepGetSyncStandbys();
> >
> > if (list_length(sync_standbys) <SyncRepConfig->num_sync()
> > {
> >   (snip)
> > }
> >
> > /* Here if MyWalSnd is not listed in sync_standby, quick exit. */
> > if (list_member_int(sync_standbys, MyWalSnd->slotno))
> > return false;
> 
> list_member_int() performs the loop internally. So I'm not sure how much
> adding extra list_member_int() here can optimize this processing.
> Another idea is to make SyncRepGetSyncStandby() check whether I'm sync
> standby or not. In this idea, without adding extra loop, we can exit earilier
> in the case where I'm not a sync standby. Does this make sense?


The list_member_int() is also performed in the "(snip)" part. So
SyncRepGetSyncStandbys() returning am_sync seems making sense.

sync_standbys = SyncRepGetSyncStandbys(am_sync);

/*
 *  Quick exit if I am not synchronous or there's not
 *  enough synchronous standbys
 * /
if (!*am_sync || list_length(sync_standbys) < SyncRepConfig->num_sync)
{
  list_free(sync_standbys);
  return false;
}


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Support for N synchronous standby servers - take 2

Reply via email to