Re: [HACKERS] Sync Rep v19

2011-04-29 Thread Bruce Momjian
Bruce Momjian wrote: Simon Riggs wrote: On Wed, 2011-03-09 at 21:21 -0500, Bruce Momjian wrote: Simon Riggs wrote: On Fri, 2011-03-04 at 23:15 +0900, Fujii Masao wrote: postgres=# SELECT application_name, state, sync_priority, sync_state FROM pg_stat_replication;

Re: [HACKERS] Sync Rep v19

2011-04-26 Thread Bruce Momjian
Simon Riggs wrote: On Wed, 2011-03-09 at 21:21 -0500, Bruce Momjian wrote: Simon Riggs wrote: On Fri, 2011-03-04 at 23:15 +0900, Fujii Masao wrote: postgres=# SELECT application_name, state, sync_priority, sync_state FROM pg_stat_replication; application_name | state |

Re: [HACKERS] Sync Rep v19

2011-03-28 Thread Simon Riggs
On Thu, Mar 24, 2011 at 11:53 AM, Fujii Masao masao.fu...@gmail.com wrote: On Thu, Mar 24, 2011 at 8:34 PM, Simon Riggs si...@2ndquadrant.com wrote: On Thu, Mar 24, 2011 at 11:17 AM, Fujii Masao masao.fu...@gmail.com wrote: On Wed, Mar 23, 2011 at 5:53 PM, Fujii Masao masao.fu...@gmail.com

Re: [HACKERS] Sync Rep v19

2011-03-24 Thread Fujii Masao
On Wed, Mar 23, 2011 at 5:53 PM, Fujii Masao masao.fu...@gmail.com wrote: Do you still want to work up a patch for this?  If so, I can review. Sure. Will do. The attached patch allows standby servers to connect during smart shutdown in order to wake up backends waiting for sync rep. Regards,

Re: [HACKERS] Sync Rep v19

2011-03-24 Thread Simon Riggs
On Thu, Mar 24, 2011 at 11:17 AM, Fujii Masao masao.fu...@gmail.com wrote: On Wed, Mar 23, 2011 at 5:53 PM, Fujii Masao masao.fu...@gmail.com wrote: Do you still want to work up a patch for this?  If so, I can review. Sure. Will do. The attached patch allows standby servers to connect during

Re: [HACKERS] Sync Rep v19

2011-03-24 Thread Fujii Masao
On Thu, Mar 24, 2011 at 8:34 PM, Simon Riggs si...@2ndquadrant.com wrote: On Thu, Mar 24, 2011 at 11:17 AM, Fujii Masao masao.fu...@gmail.com wrote: On Wed, Mar 23, 2011 at 5:53 PM, Fujii Masao masao.fu...@gmail.com wrote: Do you still want to work up a patch for this?  If so, I can review.

Re: [HACKERS] Sync Rep v19

2011-03-23 Thread Fujii Masao
On Sat, Mar 19, 2011 at 11:28 AM, Robert Haas robertmh...@gmail.com wrote: On Fri, Mar 18, 2011 at 10:25 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, Mar 8, 2011 at 7:05 AM, Fujii Masao masao.fu...@gmail.com wrote: * Smart shutdown Smart shutdown should wait for all the waiting

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-22 Thread Yeb Havinga
On 2011-03-21 23:58, Yeb Havinga wrote: On Mon, Mar 21, 2011 at 7:51 PM, Yeb Havinga yebhavi...@gmail.com wrote: On 2011-03-21 18:04, Robert Haas wrote: On Mon, Mar 21, 2011 at 12:29 PM, Yeb Havinga yebhavi...@gmail.com

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-22 Thread Robert Haas
On Tue, Mar 22, 2011 at 3:25 PM, Yeb Havinga yebhavi...@gmail.com wrote: So the patch eats 4,5% from git master's syncrep performance in my setup. Don't know how to measure it better than that. That's quite surprising, but I guess the way forward is clear: don't apply that patch. -- Robert

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-21 Thread Yeb Havinga
On 2011-03-21 02:05, Robert Haas wrote: On Sun, Mar 20, 2011 at 11:03 AM, Yeb Havinga yebhavi...@gmail.com wrote: On 2011-03-20 05:44, Robert Haas wrote: Hmm, I'm not going to be able to reproduce this here, and my test setup didn't show a clear regression. I can try beating on it some more,

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-21 Thread Robert Haas
On Mon, Mar 21, 2011 at 12:29 PM, Yeb Havinga yebhavi...@gmail.com wrote: pgbench -i -s 50 test Two runs of pgbench -c 10 -M prepared -T 600 test with 1 sync standby - server configs etc were mailed upthread. - performance as of commit e148443ddd95cd29edf4cc1de6188eb9cee029c5 1158 and 1306

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-21 Thread Yeb Havinga
On 2011-03-21 18:04, Robert Haas wrote: On Mon, Mar 21, 2011 at 12:29 PM, Yeb Havinga yebhavi...@gmail.com wrote: pgbench -i -s 50 test Two runs of pgbench -c 10 -M prepared -T 600 test with 1 sync standby - server configs etc were mailed upthread. - performance as of commit

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-21 Thread Yeb Havinga
On Mon, Mar 21, 2011 at 7:51 PM, Yeb Havinga yebhavi...@gmail.com wrote: On 2011-03-21 18:04, Robert Haas wrote: On Mon, Mar 21, 2011 at 12:29 PM, Yeb Havinga yebhavi...@gmail.com wrote: pgbench -i -s 50 test Two runs of pgbench -c 10 -M prepared -T 600 test with 1 sync standby - server

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-20 Thread Yeb Havinga
On 2011-03-20 05:44, Robert Haas wrote: Hmm, I'm not going to be able to reproduce this here, and my test setup didn't show a clear regression. I can try beating on it some more, but... Any chance you could rerun your test with the latest master-branch code, and perhaps also with the patch I

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-20 Thread Robert Haas
On Sun, Mar 20, 2011 at 11:03 AM, Yeb Havinga yebhavi...@gmail.com wrote: On 2011-03-20 05:44, Robert Haas wrote: Hmm, I'm not going to be able to reproduce this here, and my test setup didn't show a clear regression.  I can try beating on it some more, but...  Any chance you could rerun your

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-19 Thread Yeb Havinga
On 2011-03-18 18:25, Robert Haas wrote: On Fri, Mar 18, 2011 at 1:15 PM, Simon Riggs si...@2ndquadrant.com wrote: On Thu, 2011-03-17 at 09:33 -0400, Robert Haas wrote: Thanks for the review! Let's have a look here... You've added a test inside the lock to see if there is a standby, which I

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-19 Thread Robert Haas
On Sat, Mar 19, 2011 at 10:32 AM, Yeb Havinga yebhavi...@gmail.com wrote: Testing 'methodology' sounds a bit heavy. I tested a number of patch versions over time, with 30 second, hourly and nightly pgbench runs. The nightly more for durability/memory leak testing than tps numbers, since I

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-18 Thread Robert Haas
On Thu, Mar 17, 2011 at 6:00 PM, Jeff Davis pg...@j-davis.com wrote: On Wed, 2011-03-16 at 13:35 -0400, Robert Haas wrote: 2. If a query cancel interrupt is received (pg_cancel_backend or ^C), then cancel the sync rep wait and issue a warning before acknowledging the commit. When I saw this

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-18 Thread Jeff Davis
On Fri, 2011-03-18 at 08:27 -0400, Robert Haas wrote: On Thu, Mar 17, 2011 at 6:00 PM, Jeff Davis pg...@j-davis.com wrote: On Wed, 2011-03-16 at 13:35 -0400, Robert Haas wrote: 2. If a query cancel interrupt is received (pg_cancel_backend or ^C), then cancel the sync rep wait and issue a

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-18 Thread Robert Haas
On Fri, Mar 18, 2011 at 10:17 AM, Jeff Davis pg...@j-davis.com wrote: On Fri, 2011-03-18 at 08:27 -0400, Robert Haas wrote: On Thu, Mar 17, 2011 at 6:00 PM, Jeff Davis pg...@j-davis.com wrote: On Wed, 2011-03-16 at 13:35 -0400, Robert Haas wrote: 2. If a query cancel interrupt is received

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-18 Thread Jeff Davis
On Fri, 2011-03-18 at 10:27 -0400, Robert Haas wrote: ERRCODE_(WARNING_?)REPLICATION_WAIT_CANCELLED ...which might have something to recommend it. Works for me. Regards, Jeff Davis

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-18 Thread Heikki Linnakangas
On 18.03.2011 17:38, Jeff Davis wrote: On Fri, 2011-03-18 at 10:27 -0400, Robert Haas wrote: ERRCODE_(WARNING_?)REPLICATION_WAIT_CANCELLED ...which might have something to recommend it. Works for me. Yes, sounds reasonable. Without WARNING_, please. -- Heikki Linnakangas EnterpriseDB

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-18 Thread Robert Haas
On Fri, Mar 18, 2011 at 11:58 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 18.03.2011 17:38, Jeff Davis wrote: On Fri, 2011-03-18 at 10:27 -0400, Robert Haas wrote: ERRCODE_(WARNING_?)REPLICATION_WAIT_CANCELLED ...which might have something to recommend it. Works

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-18 Thread Simon Riggs
On Thu, 2011-03-17 at 09:33 -0400, Robert Haas wrote: Thanks for the review! Let's have a look here... You've added a test inside the lock to see if there is a standby, which I took out for performance reasons. Maybe there's another way, I know that code is fiddly. You've also added back in

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-18 Thread Robert Haas
On Fri, Mar 18, 2011 at 1:15 PM, Simon Riggs si...@2ndquadrant.com wrote: On Thu, 2011-03-17 at 09:33 -0400, Robert Haas wrote: Thanks for the review! Let's have a look here... You've added a test inside the lock to see if there is a standby, which I took out for performance reasons. Maybe

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-18 Thread Alvaro Herrera
Excerpts from Robert Haas's message of Fri Mar 18 14:25:16 -0300 2011: On Fri, Mar 18, 2011 at 1:15 PM, Simon Riggs si...@2ndquadrant.com wrote: SyncRepUpdateSyncStandbysDefined() is added into walwriter, which means waiters won't be released if we do a sighup during a fast shutdown, since

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-18 Thread Robert Haas
On Fri, Mar 18, 2011 at 2:55 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Excerpts from Robert Haas's message of Fri Mar 18 14:25:16 -0300 2011: On Fri, Mar 18, 2011 at 1:15 PM, Simon Riggs si...@2ndquadrant.com wrote: SyncRepUpdateSyncStandbysDefined() is added into walwriter, which

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-18 Thread Robert Haas
Responding to this again, somewhat out of order... On Fri, Mar 18, 2011 at 1:15 PM, Simon Riggs si...@2ndquadrant.com wrote: Together that's about a 20% hit in performance in Yeb's tests. I think you should spend a little time thinking how to retune that. I've spent some time playing around

Re: [HACKERS] Sync Rep v19

2011-03-18 Thread Robert Haas
On Tue, Mar 8, 2011 at 7:05 AM, Fujii Masao masao.fu...@gmail.com wrote: * Smart shutdown Smart shutdown should wait for all the waiting backends to be acked, and should not cause them to forcibly exit. But this leads shutdown to get stuck infinitely if there is no walsender at that time. To

Re: [HACKERS] Sync Rep v19

2011-03-18 Thread Robert Haas
On Fri, Mar 18, 2011 at 10:25 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, Mar 8, 2011 at 7:05 AM, Fujii Masao masao.fu...@gmail.com wrote: * Smart shutdown Smart shutdown should wait for all the waiting backends to be acked, and should not cause them to forcibly exit. But this leads

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-17 Thread Fujii Masao
On Thu, Mar 17, 2011 at 2:35 AM, Robert Haas robertmh...@gmail.com wrote: 1. If a die interrupt is received (pg_terminate_backend or fast shutdown), then terminate the sync rep wait and arrange for the connection to be closed without acknowledging the commit (but do send a warning message

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-17 Thread Heikki Linnakangas
On 16.03.2011 19:35, Robert Haas wrote: 3. If synchronous_standby_names is changed to '' by editing postgresql.conf and issuing pg_ctl reload, then cancel all waits in progress and wake everybody up. As I mentioned before, reloading the config file from within the waiting backend (which can't

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-17 Thread Robert Haas
On Thu, Mar 17, 2011 at 2:08 AM, Fujii Masao masao.fu...@gmail.com wrote: This occurs to me; we should ensure that, in shutdown case, walwriter should exit after all the backends have gone out? I'm not sure if it's worth thinking of the case, but what if synchronous_standby_names is unset and

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-17 Thread Robert Haas
On Thu, Mar 17, 2011 at 8:24 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Hmm, so setting synchronous_standby_names to '' takes effect immediately, but other changes to it don't apply to already-blocked commits. That seems a bit inconsistent. Perhaps walwriter should store

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-17 Thread Jeff Davis
On Wed, 2011-03-16 at 13:35 -0400, Robert Haas wrote: 2. If a query cancel interrupt is received (pg_cancel_backend or ^C), then cancel the sync rep wait and issue a warning before acknowledging the commit. When I saw this commit, I noticed that the WARNING doesn't have an errcode(). It seems

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-16 Thread Simon Riggs
On Tue, 2011-03-15 at 22:07 -0400, Robert Haas wrote: On Wed, Mar 9, 2011 at 11:11 PM, Fujii Masao masao.fu...@gmail.com wrote: Same as above. I think that it's more problematic to leave the code as it is. Because smart/fast shutdown can make the server get stuck until immediate shutdown is

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-16 Thread Robert Haas
On Wed, Mar 16, 2011 at 1:43 AM, Fujii Masao masao.fu...@gmail.com wrote: On Wed, Mar 16, 2011 at 11:07 AM, Robert Haas robertmh...@gmail.com wrote: The problem is that there may be another backend B waiting on a lock held by A.  If backend A exits cleanly (without a PANIC), it will remove

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-16 Thread Robert Haas
On Wed, Mar 16, 2011 at 4:51 AM, Simon Riggs si...@2ndquadrant.com wrote: On Tue, 2011-03-15 at 22:07 -0400, Robert Haas wrote: On Wed, Mar 9, 2011 at 11:11 PM, Fujii Masao masao.fu...@gmail.com wrote: Same as above. I think that it's more problematic to leave the code as it is. Because

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-16 Thread Robert Haas
On Wed, Mar 16, 2011 at 7:39 AM, Robert Haas robertmh...@gmail.com wrote: The only idea I have for allowing fast shutdown to still be fast, even when sync rep is involved, is to shut down the system in two phases. The postmaster would need to stop accepting new connections, and first kill off

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-16 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes: 1. If a die interrupt is received (pg_terminate_backend or fast shutdown), then terminate the sync rep wait and arrange for the connection to be closed without acknowledging the commit (but do send a warning message back). The commit still happened,

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-16 Thread Robert Haas
On Wed, Mar 16, 2011 at 6:23 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Robert Haas robertmh...@gmail.com writes: 1. If a die interrupt is received (pg_terminate_backend or fast shutdown), then terminate the sync rep wait and arrange for the connection to be closed without

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-16 Thread Aidan Van Dyk
On Wed, Mar 16, 2011 at 8:30 PM, Robert Haas robertmh...@gmail.com wrote: I think the most important part of all this is that it is logged. Anyone who is running synchronous replication should also be doing careful monitoring; if not, shame on them, because if your data is important enough

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-15 Thread Robert Haas
On Wed, Mar 9, 2011 at 11:11 PM, Fujii Masao masao.fu...@gmail.com wrote: Same as above. I think that it's more problematic to leave the code as it is. Because smart/fast shutdown can make the server get stuck until immediate shutdown is requested. I agree that the current state of affairs is

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-15 Thread Fujii Masao
On Wed, Mar 16, 2011 at 11:07 AM, Robert Haas robertmh...@gmail.com wrote: The problem is that there may be another backend B waiting on a lock held by A.  If backend A exits cleanly (without a PANIC), it will remove itself from the ProcArray and release locks.  That wakes up B, which can now

Re: [HACKERS] Sync Rep v19

2011-03-11 Thread Fujii Masao
On Fri, Mar 11, 2011 at 5:50 AM, Robert Haas robertmh...@gmail.com wrote: On Thu, Mar 10, 2011 at 3:29 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Robert Haas robertmh...@gmail.com writes: they are, but there's no easy way to figure out what that means in terms of wall-clock time,

Re: [HACKERS] Sync Rep v19

2011-03-11 Thread Robert Haas
On Fri, Mar 11, 2011 at 7:08 AM, Fujii Masao masao.fu...@gmail.com wrote: On Fri, Mar 11, 2011 at 5:50 AM, Robert Haas robertmh...@gmail.com wrote: On Thu, Mar 10, 2011 at 3:29 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Robert Haas robertmh...@gmail.com writes: they are, but there's

Re: [HACKERS] Sync Rep v19

2011-03-11 Thread Fujii Masao
On Fri, Mar 11, 2011 at 10:02 PM, Robert Haas robertmh...@gmail.com wrote: How about sending the timestamp of last applied transaction (i.e., this is the return value of pg_last_xact_replay_timestamp) from the standby to the master, and reporting it in pg_stat_replication? Then you can see the

Re: [HACKERS] Sync Rep v19

2011-03-11 Thread Robert Haas
On Fri, Mar 11, 2011 at 8:21 AM, Fujii Masao masao.fu...@gmail.com wrote: On Fri, Mar 11, 2011 at 10:02 PM, Robert Haas robertmh...@gmail.com wrote: How about sending the timestamp of last applied transaction (i.e., this is the return value of pg_last_xact_replay_timestamp) from the standby to

Re: [HACKERS] Sync Rep v19

2011-03-11 Thread Ross J. Reedstrom
On Fri, Mar 11, 2011 at 09:03:33AM -0500, Robert Haas wrote: On Fri, Mar 11, 2011 at 8:21 AM, Fujii Masao masao.fu...@gmail.com wrote: In that case, the last write WAL timestamp would become equal to the last replay WAL timestamp. So we can see that there is no lag. Oh, I see (I think).

Re: [HACKERS] Sync Rep v19

2011-03-10 Thread Robert Haas
On Wed, Mar 9, 2011 at 9:21 PM, Bruce Momjian br...@momjian.us wrote: Simon Riggs wrote: On Fri, 2011-03-04 at 23:15 +0900, Fujii Masao wrote: postgres=# SELECT application_name, state, sync_priority, sync_state FROM pg_stat_replication;  application_name |   state   | sync_priority |

Re: [HACKERS] Sync Rep v19

2011-03-10 Thread Simon Riggs
On Wed, 2011-03-09 at 21:21 -0500, Bruce Momjian wrote: Simon Riggs wrote: On Fri, 2011-03-04 at 23:15 +0900, Fujii Masao wrote: postgres=# SELECT application_name, state, sync_priority, sync_state FROM pg_stat_replication; application_name | state | sync_priority | sync_state

Re: [HACKERS] Sync Rep v19

2011-03-10 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes: was. So you could then say things like is the most recent time at which the standby was caught up within the last 30 seconds?, which would be a useful thing to monitor, and right now there's no way to do Well in my experience with replication, that's

Re: [HACKERS] Sync Rep v19

2011-03-10 Thread Robert Haas
On Thu, Mar 10, 2011 at 2:42 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Robert Haas robertmh...@gmail.com writes: was.  So you could then say things like is the most recent time at which the standby was caught up within the last 30 seconds?, which would be a useful thing to monitor,

Re: [HACKERS] Sync Rep v19

2011-03-10 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes: they are, but there's no easy way to figure out what that means in terms of wall-clock time, which I think would be useful. Jan Wieck had a detailed proposal to make that happen at last developer meeting, but then ran out of time to implement it for

Re: [HACKERS] Sync Rep v19

2011-03-10 Thread Robert Haas
On Thu, Mar 10, 2011 at 3:29 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Robert Haas robertmh...@gmail.com writes: they are, but there's no easy way to figure out what that means in terms of wall-clock time, which I think would be useful. Jan Wieck had a detailed proposal to make that

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-09 Thread Yeb Havinga
On 2011-03-09 08:38, Fujii Masao wrote: On Wed, Mar 9, 2011 at 2:14 PM, Jaime Casanova ja...@2ndquadrant.com wrote: On Tue, Mar 8, 2011 at 11:58 AM, Robert Haas robertmh...@gmail.com wrote: The fast shutdown handling seems fine, but why not just handle smart shutdown the same way? currently,

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-09 Thread Magnus Hagander
On Wed, Mar 9, 2011 at 08:38, Fujii Masao masao.fu...@gmail.com wrote: On Wed, Mar 9, 2011 at 2:14 PM, Jaime Casanova ja...@2ndquadrant.com wrote: On Tue, Mar 8, 2011 at 11:58 AM, Robert Haas robertmh...@gmail.com wrote: The fast shutdown handling seems fine, but why not just handle smart

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-09 Thread Simon Riggs
On Wed, 2011-03-09 at 16:38 +0900, Fujii Masao wrote: On Wed, Mar 9, 2011 at 2:14 PM, Jaime Casanova ja...@2ndquadrant.com wrote: On Tue, Mar 8, 2011 at 11:58 AM, Robert Haas robertmh...@gmail.com wrote: The fast shutdown handling seems fine, but why not just handle smart shutdown the

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-09 Thread Yeb Havinga
On 2011-03-09 15:10, Simon Riggs wrote: On Wed, 2011-03-09 at 16:38 +0900, Fujii Masao wrote: On Wed, Mar 9, 2011 at 2:14 PM, Jaime Casanova ja...@2ndquadrant.com wrote: On Tue, Mar 8, 2011 at 11:58 AM, Robert Haas robertmh...@gmail.com wrote: The fast shutdown handling seems fine, but why

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-09 Thread Simon Riggs
On Wed, 2011-03-09 at 15:37 +0100, Yeb Havinga wrote: The current situation is definitely unsafe because it forces people that are in this state to do a fast shutdown.. but that fails as well, so they are only left with immediate. All the more reason not to change anything, since we disagree.

Re: [HACKERS] Sync Rep v19

2011-03-09 Thread Bruce Momjian
Simon Riggs wrote: On Fri, 2011-03-04 at 23:15 +0900, Fujii Masao wrote: postgres=# SELECT application_name, state, sync_priority, sync_state FROM pg_stat_replication; application_name | state | sync_priority | sync_state --+---+---+

Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-09 Thread Fujii Masao
On Thu, Mar 10, 2011 at 12:03 AM, Simon Riggs si...@2ndquadrant.com wrote: On Wed, 2011-03-09 at 15:37 +0100, Yeb Havinga wrote: The current situation is definitely unsafe because it forces people that are in this state to do a fast shutdown.. but that fails as well, so they are only left

Re: [HACKERS] Sync Rep v19

2011-03-08 Thread Fujii Masao
On Mon, Mar 7, 2011 at 4:54 AM, Robert Haas robertmh...@gmail.com wrote: On Mar 6, 2011, at 9:44 AM, Fujii Masao masao.fu...@gmail.com wrote: On Sun, Mar 6, 2011 at 5:02 PM, Yeb Havinga yebhavi...@gmail.com wrote: On Sun, Mar 6, 2011 at 8:58 AM, Fujii Masao masao.fu...@gmail.com wrote: If

Re: [HACKERS] Sync Rep v19

2011-03-08 Thread Robert Haas
On Tue, Mar 8, 2011 at 7:05 AM, Fujii Masao masao.fu...@gmail.com wrote: Yeah, let's think about how shutdown should work. I'd like to propose the following. Thought? * Smart shutdown Smart shutdown should wait for all the waiting backends to be acked, and should not cause them to forcibly

Re: [HACKERS] Sync Rep v19

2011-03-08 Thread Jaime Casanova
On Tue, Mar 8, 2011 at 11:58 AM, Robert Haas robertmh...@gmail.com wrote: The fast shutdown handling seems fine, but why not just handle smart shutdown the same way? currently, smart shutdown means no new connections, wait until existing ones close normally. for consistency, it should behave

Sync Rep and shutdown Re: [HACKERS] Sync Rep v19

2011-03-08 Thread Fujii Masao
On Wed, Mar 9, 2011 at 2:14 PM, Jaime Casanova ja...@2ndquadrant.com wrote: On Tue, Mar 8, 2011 at 11:58 AM, Robert Haas robertmh...@gmail.com wrote: The fast shutdown handling seems fine, but why not just handle smart shutdown the same way? currently, smart shutdown means no new

Re: [HACKERS] Sync Rep v19

2011-03-07 Thread Yeb Havinga
On 2011-03-07 01:37, Simon Riggs wrote: On Sat, 2011-03-05 at 21:11 +0100, Yeb Havinga wrote: I also got a first first 1000 tps score The committed version should be even faster. Would appreciate a retest. pgbench 5 minute test pgbench -c 10 -M prepared -T 300 test dbsize was -s 50, 1Gbit

Re: [HACKERS] Sync Rep v19

2011-03-07 Thread Simon Riggs
On Mon, 2011-03-07 at 14:20 +0100, Yeb Havinga wrote: On 2011-03-07 01:37, Simon Riggs wrote: On Sat, 2011-03-05 at 21:11 +0100, Yeb Havinga wrote: I also got a first first 1000 tps score The committed version should be even faster. Would appreciate a retest. pgbench 5 minute test

Re: [HACKERS] Sync Rep v19

2011-03-06 Thread Simon Riggs
On Sun, 2011-03-06 at 14:27 +0900, Fujii Masao wrote: On Sun, Mar 6, 2011 at 2:59 AM, Robert Haas robertmh...@gmail.com wrote: On Sat, Mar 5, 2011 at 11:56 AM, Simon Riggs si...@2ndquadrant.com wrote: Even though postmaster dies, the waiting backend keeps waiting until the timeout expires.

Re: [HACKERS] Sync Rep v19

2011-03-06 Thread Simon Riggs
On Sun, 2011-03-06 at 16:58 +0900, Fujii Masao wrote: On Sun, Mar 6, 2011 at 4:51 PM, Fujii Masao masao.fu...@gmail.com wrote: One comment; what about introducing built-in function to wake up all the waiting backends? When replication connection is closed, if we STONITH the standby, we can

Re: [HACKERS] Sync Rep v19

2011-03-06 Thread Simon Riggs
On Sun, 2011-03-06 at 01:58 +0900, Fujii Masao wrote: On Sun, Mar 6, 2011 at 12:42 AM, Fujii Masao masao.fu...@gmail.com wrote: New comments; Another one; + long timeout = SyncRepGetWaitTimeout(); snip + else if (timeout > 0 && +

Re: [HACKERS] Sync Rep v19

2011-03-06 Thread Fujii Masao
On Sun, Mar 6, 2011 at 5:26 PM, Simon Riggs si...@2ndquadrant.com wrote: On Sun, 2011-03-06 at 16:58 +0900, Fujii Masao wrote: On Sun, Mar 6, 2011 at 4:51 PM, Fujii Masao masao.fu...@gmail.com wrote: One comment; what about introducing built-in function to wake up all the waiting backends?

Re: [HACKERS] Sync Rep v19

2011-03-06 Thread Fujii Masao
On Sun, Mar 6, 2011 at 5:02 PM, Yeb Havinga yebhavi...@gmail.com wrote: On Sun, Mar 6, 2011 at 8:58 AM, Fujii Masao masao.fu...@gmail.com wrote: If unfortunately all connection slots are used by backends waiting for replication, we cannot execute such a function. So it makes more sense to

Re: [HACKERS] Sync Rep v19

2011-03-06 Thread Jaime Casanova
On 06/03/2011 03:26, Simon Riggs si...@2ndquadrant.com wrote: On Sun, 2011-03-06 at 16:58 +0900, Fujii Masao wrote: If unfortunately all connection slots are used by backends waiting for replication, we cannot execute such a function. So it makes more sense to introduce something like

Re: [HACKERS] Sync Rep v19

2011-03-06 Thread Robert Haas
On Mar 6, 2011, at 9:44 AM, Fujii Masao masao.fu...@gmail.com wrote: On Sun, Mar 6, 2011 at 5:02 PM, Yeb Havinga yebhavi...@gmail.com wrote: On Sun, Mar 6, 2011 at 8:58 AM, Fujii Masao masao.fu...@gmail.com wrote: If unfortunately all connection slots are used by backends waiting for

Re: [HACKERS] Sync Rep v19

2011-03-06 Thread Simon Riggs
On Sun, 2011-03-06 at 16:51 +0900, Fujii Masao wrote: One comment; what about introducing built-in function to wake up all the waiting backends? When replication connection is closed, if we STONITH the standby, we can safely (for not physical data loss but logical one) switch the primary to

Re: [HACKERS] Sync Rep v19

2011-03-06 Thread Simon Riggs
On Sat, 2011-03-05 at 21:11 +0100, Yeb Havinga wrote: I also got a first first 1000 tps score The committed version should be even faster. Would appreciate a retest. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services

Re: [HACKERS] Sync Rep v19

2011-03-05 Thread Simon Riggs
On Sat, 2011-03-05 at 16:13 +0900, Fujii Masao wrote: On Sat, Mar 5, 2011 at 7:28 AM, Simon Riggs si...@2ndquadrant.com wrote: Almost-working patch attached for the above feature. Time to stop for the day. Patch against current repo version. Current repo version attached here also (v20),

Re: [HACKERS] Sync Rep v19

2011-03-05 Thread Fujii Masao
On Sat, Mar 5, 2011 at 7:28 AM, Simon Riggs si...@2ndquadrant.com wrote: Yes, that can happen. As people will no doubt observe, this seems to be an argument for wait-forever. What we actually need is a wait that lasts longer than it takes for us to decide to failover, if the standby is

Re: [HACKERS] Sync Rep v19

2011-03-05 Thread Simon Riggs
On Sat, 2011-03-05 at 11:04 +0000, Simon Riggs wrote: + /* + * Assume the queue is ordered by LSN + */ + if (XLByteLT(walsndctl->lsn, proc->waitLSN)) + return numprocs; The code to ensure the assumption needs to be
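
The assumption in the excerpt above, that the wait queue is ordered by LSN, is what lets the release loop stop at the first waiter the standby has not yet acknowledged. A minimal Python sketch of that idea follows. It is not the patch's C code: `Waiter`, `release_waiters`, `enqueue_in_lsn_order`, and the integer LSNs are hypothetical names chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Waiter:
    """Hypothetical stand-in for a backend waiting on a commit LSN."""
    name: str
    wait_lsn: int  # LSN this backend needs acknowledged


def release_waiters(queue: deque, acked_lsn: int) -> list:
    """Pop and return the waiters whose wait_lsn <= acked_lsn.

    Relies on the queue being sorted by wait_lsn ascending: the first
    waiter beyond acked_lsn ends the scan, since nothing behind it can
    be ready either.
    """
    released = []
    while queue and queue[0].wait_lsn <= acked_lsn:
        released.append(queue.popleft())
    return released


def enqueue_in_lsn_order(queue: deque, waiter: Waiter) -> None:
    """Insert so the ordering assumption holds, scanning from the tail."""
    idx = len(queue)
    while idx > 0 and queue[idx - 1].wait_lsn > waiter.wait_lsn:
        idx -= 1
    queue.insert(idx, waiter)
```

Scanning from the tail on insert is a guess at the common case, where new commits usually carry the highest LSN and land at the end without any walk.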

Re: [HACKERS] Sync Rep v19

2011-03-05 Thread Robert Haas
On Sat, Mar 5, 2011 at 6:04 AM, Simon Riggs si...@2ndquadrant.com wrote: It is documented that the selection of standby from a set of similar priorities is indeterminate. Users don't like it, they can change it. That doesn't seem like a good argument to *change* the synchronous standby once

Re: [HACKERS] Sync Rep v19

2011-03-05 Thread Simon Riggs
On Sat, 2011-03-05 at 07:24 -0500, Robert Haas wrote: On Sat, Mar 5, 2011 at 6:04 AM, Simon Riggs si...@2ndquadrant.com wrote: It is documented that the selection of standby from a set of similar priorities is indeterminate. Users don't like it, they can change it. That doesn't seem like a

Re: [HACKERS] Sync Rep v19

2011-03-05 Thread Robert Haas
On Sat, Mar 5, 2011 at 7:49 AM, Simon Riggs si...@2ndquadrant.com wrote: On Sat, 2011-03-05 at 07:24 -0500, Robert Haas wrote: On Sat, Mar 5, 2011 at 6:04 AM, Simon Riggs si...@2ndquadrant.com wrote: It is documented that the selection of standby from a set of similar priorities is

Re: [HACKERS] Sync Rep v19

2011-03-05 Thread Yeb Havinga
On Sat, Mar 5, 2011 at 2:05 PM, Robert Haas robertmh...@gmail.com wrote: On Sat, Mar 5, 2011 at 7:49 AM, Simon Riggs si...@2ndquadrant.com wrote: If the order is arbitrary, why does it matter if it changes? The user has the power to specify a sequence, yet they have not done so. They are

Re: [HACKERS] Sync Rep v19

2011-03-05 Thread Simon Riggs
On Sat, 2011-03-05 at 14:44 +0100, Yeb Havinga wrote: On Sat, Mar 5, 2011 at 2:05 PM, Robert Haas robertmh...@gmail.com wrote: On Sat, Mar 5, 2011 at 7:49 AM, Simon Riggs si...@2ndquadrant.com wrote: If the order is arbitrary, why does it matter if it changes?

Re: [HACKERS] Sync Rep v19

2011-03-05 Thread Simon Riggs
On Sat, 2011-03-05 at 20:08 +0900, Fujii Masao wrote: On Sat, Mar 5, 2011 at 7:28 AM, Simon Riggs si...@2ndquadrant.com wrote: Yes, that can happen. As people will no doubt observe, this seems to be an argument for wait-forever. What we actually need is a wait that lasts longer than it

Re: [HACKERS] Sync Rep v19

2011-03-05 Thread Fujii Masao
On Sat, Mar 5, 2011 at 9:21 PM, Simon Riggs si...@2ndquadrant.com wrote: I've added code to shmqueue.c to allow this. New version pushed. New comments; It looks odd to report the sync_state of walsender in BACKUP state as ASYNC. +SyncRepCleanupAtProcExit(int code, Datum arg) +{ + if

Re: [HACKERS] Sync Rep v19

2011-03-05 Thread Fujii Masao
On Sun, Mar 6, 2011 at 12:07 AM, Simon Riggs si...@2ndquadrant.com wrote: I'm not in favour. If the user has a preferred order, they can specify it. If there is no preferred order, how will we maintain that order? What are the rules for maintaining this arbitrary order? Probably what

Re: [HACKERS] Sync Rep v19

2011-03-05 Thread Robert Haas
On Mar 5, 2011, at 11:17 AM, Fujii Masao masao.fu...@gmail.com wrote: On Sun, Mar 6, 2011 at 12:07 AM, Simon Riggs si...@2ndquadrant.com wrote: I'm not in favour. If the user has a preferred order, they can specify it. If there is no preferred order, how will we maintain that order? What

Re: [HACKERS] Sync Rep v19

2011-03-05 Thread Jaime Casanova
On 05/03/2011 11:18, Fujii Masao masao.fu...@gmail.com wrote: On Sun, Mar 6, 2011 at 12:07 AM, Simon Riggs si...@2ndquadrant.com wrote: I'm not in favour. If the user has a preferred order, they can specify it. If there is no preferred order, how will we maintain that order? What

Re: [HACKERS] Sync Rep v19

2011-03-05 Thread Simon Riggs
On Sun, 2011-03-06 at 01:17 +0900, Fujii Masao wrote: On Sun, Mar 6, 2011 at 12:07 AM, Simon Riggs si...@2ndquadrant.com wrote: I'm not in favour. If the user has a preferred order, they can specify it. If there is no preferred order, how will we maintain that order? What are the

Re: [HACKERS] Sync Rep v19

2011-03-05 Thread Simon Riggs
On Sat, 2011-03-05 at 11:42 -0500, Jaime Casanova wrote: On 05/03/2011 11:18, Fujii Masao masao.fu...@gmail.com wrote: On Sun, Mar 6, 2011 at 12:07 AM, Simon Riggs si...@2ndquadrant.com wrote: I'm not in favour. If the user has a preferred order, they can specify it. If there is

Re: [HACKERS] Sync Rep v19

2011-03-05 Thread Simon Riggs
On Sun, 2011-03-06 at 00:42 +0900, Fujii Masao wrote: On Sat, Mar 5, 2011 at 9:21 PM, Simon Riggs si...@2ndquadrant.com wrote: I've added code to shmqueue.c to allow this. New version pushed. New comments; None of the requested changes are in v21, as yet. It looks odd to report the

Re: [HACKERS] Sync Rep v19

2011-03-05 Thread Fujii Masao
On Sun, Mar 6, 2011 at 12:42 AM, Fujii Masao masao.fu...@gmail.com wrote: New comments; Another one; + long timeout = SyncRepGetWaitTimeout(); snip + else if (timeout > 0 && + TimestampDifferenceExceeds(wait_start, now,

Re: [HACKERS] Sync Rep v19

2011-03-05 Thread Yeb Havinga
On Sat, Mar 5, 2011 at 5:53 PM, Simon Riggs si...@2ndquadrant.com wrote: On a positive note this is one less parameter and will improve performance as well. All above changes made. Ready to commit, barring concrete objections to important behaviour. I will do one final check tomorrow

Re: [HACKERS] Sync Rep v19

2011-03-05 Thread Robert Haas
On Sat, Mar 5, 2011 at 11:56 AM, Simon Riggs si...@2ndquadrant.com wrote: Even though postmaster dies, the waiting backend keeps waiting until the timeout expires. Instead, the backends should periodically check whether postmaster is alive, and then they should exit immediately if it's not
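
The suggestion quoted in this excerpt, that a waiting backend should periodically check whether the postmaster is alive instead of sleeping until the timeout, can be sketched as a polling loop. This is a Python illustration only, not PostgreSQL code; `wait_for_ack` and its callback parameters are hypothetical.

```python
import time


def wait_for_ack(is_acked, postmaster_alive, timeout_s, poll_s=0.1,
                 now=time.monotonic, sleep=time.sleep):
    """Wait until is_acked() is true, polling every poll_s seconds.

    Returns "acked", "postmaster_died", or "timeout". The callables are
    injected so the loop can be driven by fakes in tests. A timeout_s of
    0 or less means there is no time limit.
    """
    start = now()
    while True:
        if is_acked():
            return "acked"
        # Liveness check on every wakeup, so a dead parent is noticed
        # within one poll interval rather than at the timeout.
        if not postmaster_alive():
            return "postmaster_died"
        if timeout_s > 0 and now() - start >= timeout_s:
            return "timeout"
        sleep(poll_s)
```

The poll interval bounds how long a backend can outlive its parent; a shorter interval reacts faster at the cost of more wakeups.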

Re: [HACKERS] Sync Rep v19

2011-03-05 Thread Yeb Havinga
On 2011-03-05 18:25, Yeb Havinga wrote: On Sat, Mar 5, 2011 at 5:53 PM, Simon Riggs si...@2ndquadrant.com wrote: On a positive note this is one less parameter and will improve performance as well. All above changes made. Ready to commit, barring

Re: [HACKERS] Sync Rep v19

2011-03-05 Thread Yeb Havinga
On 2011-03-05 21:11, Yeb Havinga wrote: Summary of preliminary testing: 1) it is confusing to show messages/ contents of stat_replication that hints at syncrep, when synchronous_replication is on. s/on/off/ Also forgot to mention these tests are against the latest v21 syncrep patch. Y

Re: [HACKERS] Sync Rep v19

2011-03-05 Thread Jaime Casanova
On Sat, Mar 5, 2011 at 3:11 PM, Yeb Havinga yebhavi...@gmail.com wrote: Summary of preliminary testing: 1) it is confusing to show messages/ contents of stat_replication that hints at syncrep, when synchronous_replication is on. [for the record, Yeb explained he means OFF not on...] the thing

Re: [HACKERS] Sync Rep v19

2011-03-05 Thread Fujii Masao
On Sun, Mar 6, 2011 at 2:59 AM, Robert Haas robertmh...@gmail.com wrote: On Sat, Mar 5, 2011 at 11:56 AM, Simon Riggs si...@2ndquadrant.com wrote: Even though postmaster dies, the waiting backend keeps waiting until the timeout expires. Instead, the backends should periodically check whether
