Re: [HACKERS] Support for N synchronous standby servers - take 2

Michael Paquier Mon, 18 Jan 2016 21:56:01 -0800

On Tue, Jan 19, 2016 at 1:40 AM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
> On Mon, Jan 18, 2016 at 1:20 PM, Michael Paquier
> <michael.paqu...@gmail.com> wrote:
>> On Sun, Jan 17, 2016 at 11:09 PM, Masahiko Sawada <sawada.m...@gmail.com> 
>> wrote:
>>> On Wed, Jan 13, 2016 at 1:54 AM, Alvaro Herrera wrote:
>>> * Confirm value of pg_stat_replication.sync_state (sync, async or potential)
>>> * Confirm that the data is synchronously replicated to multiple
>>> standbys in same cases.
>>>   * case 1 : The standby which is not listed in s_s_name, is down
>>>   * case 2 : The standby which is listed in s_s_names but potential
>>> standby, is down
>>>   * case 3 : The standby which is considered as sync standby, is down.
>>> * Standby promotion
>>>
>>> In order to confirm that the commit isn't done in case #3 forever
>>> unless new sync standby is up, I think we need the framework that
>>> cancels executing query.
>>> That is, what I'm planning is,
>>> 1. Set up master server (s_s_name = '2, standby1, standby2)
>>> 2. Set up two standby servers
>>> 3. Standby1 is down
>>> 4. Create some contents on master (But transaction is not committed)
>>> 5. Cancel the #4 query. (Also confirm that the flush location of only
>>> standby2 makes progress)
>>
>> This will need some thinking and is not as easy as it sounds. There is
>> no way to hold on a connection after executing a query in the current
>> TAP infrastructure. You are just mentioning case 3, but actually cases
>> 1 and 2 are falling into the same need: if there is a failure we want
>> to be able to not be stuck in the test forever and have a way to
>> cancel a query execution at will. TAP uses psql -c to execute any sql
>> queries, but we would need something that is far lower-level, and that
>> would be basically using the perl driver for Postgres or an equivalent
>> here.
>>
>> Honestly for those tests I just thought that we could get to something
>> reliable by just looking at how each sync replication setup reflects
>> in pg_stat_replication as the flow is really getting complicated,
>> giving to the user a clear representation at SQL level of what is
>> actually occurring in the server depending on the configuration used
>> being important here.
>
> I see.
> We could check the transition of sync_state in pg_stat_replication.
> I think it means that it tests for each replication method (switching
> state) rather than synchronization of replication.
>
> What I'm planning to have are,
> * Confirm value of pg_stat_replication.sync_state (sync, async or potential)
> * Standby promotion
> * Standby catching up master
> And each replication method has above tests.
>
> Are these enough?


Does promoting the standby and checking that it caught really have
value in this context of this patch? What we just want to know is on a
master, which nodes need to be waited for when s_s_names or any other
method is used, no?
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Support for N synchronous standby servers - take 2

Reply via email to