Re: [HACKERS] Support for N synchronous standby servers - take 2

Fujii Masao Thu, 10 Mar 2016 02:22:02 -0800

On Fri, Mar 4, 2016 at 3:40 AM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
> On Thu, Mar 3, 2016 at 11:30 PM, Masahiko Sawada <sawada.m...@gmail.com> 
> wrote:
>> Hi,
>>
>> Thank you so much for reviewing this patch!
>>
>> All review comments regarding document and comment are fixed.
>> Attached latest v14 patch.
>>
>>> This accepts 'abc^Id' as a name, which is wrong behavior (but
>>> such appliction names are not allowed anyway. If you assume so,
>>> I'd like to see a comment for that.).
>>
>> 'abc^Id' is accepted as application_name, no?
>> postgres(1)=# set application_name to 'abc^Id';
>> SET
>> postgres(1)=# show application_name ;
>>  application_name
>> ------------------
>>  abc^Id
>> (1 row)
>>
>>> addlit_xd_string(char *ytext) and addlitchar_xd_string(unsigned
>>> char ychar) requires differnt character types. Is there any reason
>>> for that?
>>
>> Because addlit_xd_string() is for adding string(char *) to xd_string,
>> OTOH addlit_xd_char() is for adding just one character to xd_string.
>>
>>> I personally don't like addlit*string() things for such simple
>>> syntax but itself is acceptble enough for me. However it uses
>>> StringInfo to hold double-quoted names, which pallocs 1024 bytes
>>> of memory chunk for every double-quoted name. The chunks are
>>> finally stacked up left uncollected until the current
>>> memorycontext is deleted or reset (It is deleted just after
>>> finishing config file processing). Addition to that, setting
>>> s_s_names runs the parser twice. It seems to me too greedy and
>>> seems that static char [NAMEDATALEN] is enough using the v12 way
>>> without palloc/repalloc.
>>
>> I though that length of group name could be more than NAMEDATALEN, so
>> I use StringInfo.
>> Is it not necessary?
>>
>>> I found that the name SyncGroupName.wait_num is not
>>> instinctive. How about sync_num, sync_member_num or
>>> sync_standby_num? If the last is preferable, .members also should
>>> be .standbys .
>>
>> Thanks, sync_num is preferable to me.
>>
>> ===
>>> I am quite uncomfortable with the existence of
>>> WanSnd.sync_standby_priority. It represented the pirority in the
>>> old linear s_s_names format but nested groups or even
>>> single-level quarum list obviously doesn't fit it. Can we get rid
>>> of sync_standby_priority, even though we realize atmost
>>> n-priority for now?
>>
>> We could get rid of sync_standby_priority.
>> But if so, we will not be able to see the next sync standby in
>> pg_stat_replication system view.
>> Regarding each node priority, I was thinking that standbys in quorum
>> list have same priority, and in nested group each standbys are given
>> the priority starting from 1.
>>
>> ===
>>> The function SyncRepGetSyncedLsnsUsingPriority doesn't seem to
>>> have specific code for every prioritizing method (which are
>>> priority, quorum, nested and so). Is there any reson to use it as
>>> a callback of SyncGroupNode?
>>
>> The reason why the current code is so is that current code is for only
>> priority method supporting.
>> At first version of this feature, I'd like to implement it more simple.
>>
>> Aside from this, of course I'm planning to have specific code for nested 
>> design.
>> - The group can have some name nodes or group nodes.
>> - The group can use either 2 types of method: priority or quorum.
>> - The group has SyncRepGetSyncedLsnFn() and SyncRepGetStandbysFn()
>>   - SyncRepGetSyncedLsnsFn() function recursively determine synced LSN
>> at that moment using group's method.
>>   - SyncRepGetStandbysFn() function returns standbys of its group,
>> which are considered as sync using group's method.
>>
>> For example, s_s_name  = '3(a, b, 2[c,d]::group1)', SyncRepStandbys
>> memory structure will be,
>>
>> "main(quorum)" --- "a"
>>                         |
>>                         -- "b"
>>                         |
>>                         -- "group1(priority)" --- "c"
>>                                                      |
>>                                                      -- "d"
>>
>> When determine synced LSNs, we need to consider group1's LSN using by
>> priority method at first, and then we can determine main's LSN using
>> by quorum method with "a" LSNs, "b" LSNs and "group1" LSNs.
>> So SyncRepGetSyncedLsnsUsingPriority() function would be,
>>
>> bool
>> SyncRepGetSyncedLsnsUsingPriority(*group, *write_lsn, *flush_lsn)
>> {
>>     sync_num = group->SynRepGetSyncstandbysFn(group, sync_list);
>>
>>     if (sync_num < group->sync_num)
>>         return false;
>>
>>     for (each member of sync_list)
>>     {
>>         if (member->type == group node)
>>             call SyncRepGetSyncedLsnsFn(member, w, f) and store w and
>> f into lsn_list.
>>         else
>>             Store name node LSNs into lsn_list.
>>     }
>>
>>     Determine synced LSNs of this group using lsn_list and priority method.
>>     Store synced LSNs into write_lsn and flush_lsn.
>>     return true;
>> }
>>
>>> SyncRepClearStandbyGroupList is defined in syncrep.c but the
>>> other related functions are defined in syncgroup_gram.y. It would
>>> be better to place them together.
>>
>> SyncRepClearStandbyGroupList() is used by
>> check_synchronous_standby_names(), so I put this function syncrep.c.
>>
>>> SyncRepStandbys are to be in multilevel and the struct is
>>> naturally allowed to be so but SyncRepClearStandbyGroupList
>>> assumes it in single level.
>>
>> Because I think that we don't need to implement to fully support
>> nested style at first version.
>> We have to carefully design this feature while considering
>> expandability, but overkill implementation could be cause of crash.
>> Consider remaining time for 9.6, I feel we could implement quorum
>> method at best.
>>
>>> This is a comment from the aspect of abstractness of objects.
>>> The callers of SyncRepGetSyncStandbysUsingPriority() need to care
>>> the inside of SyncGroupNode but what the function should just
>>> return seems to be the list of wansnds element. Element number is
>>> useless when the SyncGroupNode nests.
>>> > int
>>> > SyncRepGetSyncStandbysUsingPriority(SyncGroupNode *group, volatile WalSnd 
>>> > **sync_list)
>>> This might need to expose 'volatile WalSnd*' (only pointer type)
>>> outside of walsender.
>>> Or it should return the list of index number of
>>> *WalSndCtl->walsnds*.
>>
>> SyncRepGetSyncStandbysUsingPriority() already returns the list of
>> index number of "WalSndCtl->walsnd" as sync_list, no?
>> As I mentioned above, SyncRepGetSyncStandbysFn() doesn't need care the
>> inside of SyncGroupNode in my design.
>> Selecting sync nodes from its group doesn't depend on the type of node.
>> What SyncRepGetSyncStandbyFn() should do is to select sync node from
>> *its* group.
>>
>
> Previous patch has bug around GUC parameter handling.
> Attached updated version.


Thanks for updating the patch!

Now I'm fixing some problems (e.g., current patch doesn't work
with EXEC_BACKEND environment) and revising the patch.
I will post the revised version this weekend or the first half
of next week.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Support for N synchronous standby servers - take 2

Reply via email to