Re: [HACKERS] sync process names between ps and pg_stat_activity
On 9/19/17 21:30, Tsunakawa, Takayuki wrote:
> From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Peter Eisentraut
>>> Personally, I prefer "wal writer", "wal sender" and "wal receiver",
>>> which separate words like the other process names.  But I don't mind
>>> leaving them as they are now.
>>
>> If we were to change those, that would break existing queries for
>> pg_stat_activity.  That's new in PG10, so we could change it if we were
>> really eager.  But it's probably not worth bothering.  Then again, there
>> is pg_stat_wal_receiver.  So it's all totally inconsistent.  Not sure
>> where to go.
>
> OK, I'm comfortable with it as it is now.
>
> I made this ready for committer.  You can fix the following and commit
> the patch.  Thank you.

Committed.  Thank you.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sync process names between ps and pg_stat_activity
Michael Paquier writes:
> On Fri, Sep 1, 2017 at 5:33 AM, Tom Lane wrote:
>> Peter Eisentraut writes:
>>> As an aside, is there a reason why the archiver process is not
>>> included in pg_stat_activity?
>>
>> It's not connected to shared memory.
>
> Do you think that monitoring would be a sufficient reason to do so?  My
> personal opinion on the matter is that people are more and more going
> to move to pull (*) models (aka pg_receivewal and such with replication
> slots) instead of push (*) models (use of archive_command), so that
> monitoring of the archiver becomes less and less useful in the
> long-term.  And there is also pg_stat_archiver, which largely covers
> the gap for archive failures.

Meh.  I think keeping it separate from shared memory is a good thing for
reliability reasons.

> Still, one reason that could be used to connect it to shared memory is
> to control the interval of time used for archive attempts, which is
> now a hardcoded interval of 1s in pgarch_ArchiverCopyLoop().  Here more
> flexibility would definitely be useful.

AFAIR, GUC reloads work regardless of that.  If we wanted to make the
interval configurable, this would not prevent us from doing so.

			regards, tom lane
Re: [HACKERS] sync process names between ps and pg_stat_activity
On Fri, Sep 1, 2017 at 5:33 AM, Tom Lane wrote:
> Peter Eisentraut writes:
>> As an aside, is there a reason why the archiver process is not included
>> in pg_stat_activity?
>
> It's not connected to shared memory.

Do you think that monitoring would be a sufficient reason to do so?  My
personal opinion on the matter is that people are more and more going to
move to pull (*) models (aka pg_receivewal and such with replication
slots) instead of push (*) models (use of archive_command), so that
monitoring of the archiver becomes less and less useful in the
long-term.  And there is also pg_stat_archiver, which largely covers the
gap for archive failures.

Still, one reason that could be used to connect it to shared memory is
to control the interval of time used for archive attempts, which is now
a hardcoded interval of 1s in pgarch_ArchiverCopyLoop().  Here more
flexibility would definitely be useful.

(*): this wording is from a colleague, not from me.
-- 
Michael
Re: [HACKERS] sync process names between ps and pg_stat_activity
From: pgsql-hackers-ow...@postgresql.org
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Peter Eisentraut
>> Personally, I prefer "wal writer", "wal sender" and "wal receiver",
>> which separate words like the other process names.  But I don't mind
>> leaving them as they are now.
>
> If we were to change those, that would break existing queries for
> pg_stat_activity.  That's new in PG10, so we could change it if we were
> really eager.  But it's probably not worth bothering.  Then again, there
> is pg_stat_wal_receiver.  So it's all totally inconsistent.  Not sure
> where to go.

OK, I'm comfortable with it as it is now.

I made this ready for committer.  You can fix the following and commit
the patch.  Thank you.

>> * To achieve that, we pass "wal sender process" as username and
>> username
>
> good catch

Regards
Takayuki Tsunakawa
Re: [HACKERS] sync process names between ps and pg_stat_activity
On 9/18/17 02:07, MauMau wrote:
> (1)
> In the following comment, it's better to change "wal sender process"
> to "walsender" to follow the modified name.
>
> - * postgres: wal sender process
> + * postgres: walsender
>    *
>    * To achieve that, we pass "wal sender process" as username and
>    * username

good catch

> (2)
> "WAL writer process", not "walwriter", is used in postmaster.c as
> follows.  I guess this is for natural language.  Is this intended?
> I'm OK with either, though.
>
>     HandleChildCrash(pid, exitstatus, _("WAL writer process"));

Yes, we usually use that spelling in user-facing messages.

> Personally, I prefer "wal writer", "wal sender" and "wal receiver",
> which separate words like the other process names.  But I don't mind
> leaving them as they are now.

If we were to change those, that would break existing queries for
pg_stat_activity.  That's new in PG10, so we could change it if we were
really eager.  But it's probably not worth bothering.  Then again, there
is pg_stat_wal_receiver.  So it's all totally inconsistent.  Not sure
where to go.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] sync process names between ps and pg_stat_activity
From: Peter Eisentraut
> The process names shown in pg_stat_activity.backend_type as of PG10 and
> the process names used in the ps display are in some cases gratuitously
> different, so here is a patch to make them more alike.  Of course it
> could be debated in some cases which spelling was better.

(1)
In the following comment, it's better to change "wal sender process" to
"walsender" to follow the modified name.

- * postgres: wal sender process
+ * postgres: walsender
   *
   * To achieve that, we pass "wal sender process" as username and username
   * as dbname to init_ps_display(). XXX: should add a new variant of
   * init_ps_display() to avoid abusing the parameters like this.
   */

(2)
"WAL writer process", not "walwriter", is used in postmaster.c as
follows.  I guess this is for natural language.  Is this intended?  I'm
OK with either, though.

    HandleChildCrash(pid, exitstatus, _("WAL writer process"));

    case WalWriterProcess:
        ereport(LOG,
                (errmsg("could not fork WAL writer process: %m")));

Personally, I prefer "wal writer", "wal sender" and "wal receiver",
which separate words like the other process names.  But I don't mind
leaving them as they are now.

I'd like to mark this as ready for committer when I get some reply.

Regards
MauMau
Re: [HACKERS] sync process names between ps and pg_stat_activity
Peter Eisentraut writes:
> As an aside, is there a reason why the archiver process is not included
> in pg_stat_activity?

It's not connected to shared memory.

			regards, tom lane
Re: [HACKERS] Sync timezone code with upstream release tzcode2016c
Bruce Momjian writes:
> On Sun, Mar 27, 2016 at 05:14:44PM -0400, Tom Lane wrote:
>> A note about comparing this to upstream: I found that the best
>> way to do that was to run the IANA source files through a sed
>> filter like this: ...
>
> Is this documented for use next time?

Yes, see the README.

			regards, tom lane
Re: [HACKERS] Sync timezone code with upstream release tzcode2016c
On Sun, Mar 27, 2016 at 05:14:44PM -0400, Tom Lane wrote:
> Well, that was just about as tedious as I feared it might be, but
> attached is a patch for $SUBJECT.  We should apply this, and
> probably eventually back-patch it, but it'd be wise to let it
> age awhile in HEAD first.  Is anyone interested in reviewing it,
> or shall I just push it and see what the buildfarm thinks?
>
> A note about comparing this to upstream: I found that the best
> way to do that was to run the IANA source files through a sed
> filter like this:
>
> sed -r \
>     -e 's/^([ ]*)\*\*([ ])/\1 *\2/' \
>     -e 's/^([ ]*)\*\*$/\1 */' \
>     -e 's|^\*/| */|' \
>     -e 's/register //g' \
>     -e 's/int_fast32_t/int32/g' \
>     -e 's/int_fast64_t/int64/g' \
>     -e 's/struct tm\b/struct pg_tm/g' \
>     -e 's/\btime_t\b/pg_time_t/g' \
>
> and then pgindent them.  (If you pgindent without this, it'll make
> a hash of their preferred block-comment format with double **'s.
> As long as I had to do that, I figured I could make the filter
> deal with substituting typedef names and getting rid of their
> overenthusiasm for "register".)

Is this documented for use next time?

-- 
  Bruce Momjian                            http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + As you are, so once was I.  As I am, so you will be. +
  +                      Ancient Roman grave inscription +
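The identifier substitutions in that filter can also be sketched outside of sed. The following is an illustrative Python model of just the typedef/keyword rules (the block-comment reflow rules are omitted); it is not the script actually used, which runs `sed -r` followed by pgindent:

```python
import re

# Python model of the identifier-rewriting rules from the sed filter
# above (illustrative only; the real workflow is sed -r + pgindent).
RULES = [
    (r'register ', ''),               # drop "register" qualifiers
    (r'int_fast32_t', 'int32'),       # IANA typedefs -> PostgreSQL typedefs
    (r'int_fast64_t', 'int64'),
    (r'struct tm\b', 'struct pg_tm'),
    (r'\btime_t\b', 'pg_time_t'),     # \b keeps pg_time_t from re-matching
]

def filter_line(line):
    for pattern, repl in RULES:
        line = re.sub(pattern, repl, line)
    return line

print(filter_line('static time_t detzcode(register const char *codep);'))
# -> static pg_time_t detzcode(const char *codep);
```

Note the word boundaries on `time_t`: without them, a rewritten `pg_time_t` in the input would be mangled on a second pass.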
Re: [HACKERS] Sync Rep v19
Simon Riggs wrote:
> On Wed, 2011-03-09 at 21:21 -0500, Bruce Momjian wrote:
>> Simon Riggs wrote:
>>> On Fri, 2011-03-04 at 23:15 +0900, Fujii Masao wrote:
>>>> postgres=# SELECT application_name, state, sync_priority, sync_state
>>>>            FROM pg_stat_replication;
>>>>  application_name |   state   | sync_priority | sync_state
>>>> ------------------+-----------+---------------+------------
>>>>  one              | STREAMING |             1 | POTENTIAL
>>>>  two              | streaming |             2 | sync
>>>> (2 rows)
>>>
>>> Bug! Thanks.
>>
>> Is there a reason these statuses are all upper-case?
>
> NOT AS FAR AS I KNOW.  I'll add it to the list of changes for beta.

The attached patch lowercases the labels displayed in the view above.
(The example above was originally all upper-case.)

Applied.

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
Re: [HACKERS] Sync Rep v19
Simon Riggs wrote:
> On Wed, 2011-03-09 at 21:21 -0500, Bruce Momjian wrote:
>> Simon Riggs wrote:
>>> On Fri, 2011-03-04 at 23:15 +0900, Fujii Masao wrote:
>>>> postgres=# SELECT application_name, state, sync_priority, sync_state
>>>>            FROM pg_stat_replication;
>>>>  application_name |   state   | sync_priority | sync_state
>>>> ------------------+-----------+---------------+------------
>>>>  one              | STREAMING |             1 | POTENTIAL
>>>>  two              | streaming |             2 | sync
>>>> (2 rows)
>>>
>>> Bug! Thanks.
>>
>> Is there a reason these statuses are all upper-case?
>
> NOT AS FAR AS I KNOW.  I'll add it to the list of changes for beta.

The attached patch lowercases the labels displayed in the view above.
(The example above was originally all upper-case.)

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index af3c95a..470e6d1
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ WalSndGetStateString(WalSndState state)
 	switch (state)
 	{
 		case WALSNDSTATE_STARTUP:
-			return "STARTUP";
+			return "startup";
 		case WALSNDSTATE_BACKUP:
-			return "BACKUP";
+			return "backup";
 		case WALSNDSTATE_CATCHUP:
-			return "CATCHUP";
+			return "catchup";
 		case WALSNDSTATE_STREAMING:
-			return "STREAMING";
+			return "streaming";
 	}
 	return "UNKNOWN";
 }
@@ pg_stat_get_wal_senders(PG_FUNCTION_ARGS)
 		 * informational, not different from priority.
 		 */
 		if (sync_priority[i] == 0)
-			values[7] = CStringGetTextDatum("ASYNC");
+			values[7] = CStringGetTextDatum("async");
 		else if (i == sync_standby)
-			values[7] = CStringGetTextDatum("SYNC");
+			values[7] = CStringGetTextDatum("sync");
 		else
-			values[7] = CStringGetTextDatum("POTENTIAL");
+			values[7] = CStringGetTextDatum("potential");

 		tuplestore_putvalues(tupstore, tupdesc, values, nulls);
Re: [HACKERS] sync rep and smart shutdown
On Sat, Apr 9, 2011 at 3:53 AM, Robert Haas <robertmh...@gmail.com> wrote:
>> There are a couple of plausible ways to proceed here:
>>
>> 1. Do nothing.
>> 2. When a smart shutdown is initiated, shut off synchronous replication.
>> 3. Accept new replication connections even when the system is
>> undergoing a smart shutdown.
>>
>> I agree that #3 is impractical and #2 is a bad idea, which seems to
>> leave us with #1 (unless anyone has a #4)?  This is probably just
>> something we should figure is going to be one of the rough edges in
>> the first release of sync rep.
>
> That's kind of where my mind was headed too, although I was (probably
> vainly) hoping for a better option.

Though I proposed #3, I can live with #1 for now.  Even if smart
shutdown gets stuck, we can resolve that by requesting fast shutdown or
emptying synchronous_standby_names.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] sync rep and smart shutdown
Robert Haas <robertmh...@gmail.com> writes:
> There is an open item for synchronous replication and smart shutdown,
> with a link to here:
> http://archives.postgresql.org/pgsql-hackers/2011-03/msg01391.php
>
> There are a couple of plausible ways to proceed here:
>
> 1. Do nothing.
> 2. When a smart shutdown is initiated, shut off synchronous replication.
> 3. Accept new replication connections even when the system is
> undergoing a smart shutdown.
>
> I agree that #3 is impractical and #2 is a bad idea, which seems to
> leave us with #1 (unless anyone has a #4)?  This is probably just
> something we should figure is going to be one of the rough edges in
> the first release of sync rep.

A #4 idea did just come to mind: once we realize that there are no
working replication connections, automatically do a fast shutdown
instead, ie, forcibly roll back those transactions that are never gonna
complete.  Or at least have the postmaster bleat about it.  But I'm not
sure what it'd take to code that, and am also unsure that it's something
to undertake at this stage of the cycle.

			regards, tom lane
Re: [HACKERS] sync rep and smart shutdown
On Fri, Apr 8, 2011 at 2:38 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Robert Haas <robertmh...@gmail.com> writes:
>> There is an open item for synchronous replication and smart shutdown,
>> with a link to here:
>> http://archives.postgresql.org/pgsql-hackers/2011-03/msg01391.php
>>
>> There are a couple of plausible ways to proceed here:
>>
>> 1. Do nothing.
>> 2. When a smart shutdown is initiated, shut off synchronous replication.
>> 3. Accept new replication connections even when the system is
>> undergoing a smart shutdown.
>>
>> I agree that #3 is impractical and #2 is a bad idea, which seems to
>> leave us with #1 (unless anyone has a #4)?  This is probably just
>> something we should figure is going to be one of the rough edges in
>> the first release of sync rep.

That's kind of where my mind was headed too, although I was (probably
vainly) hoping for a better option.

> A #4 idea did just come to mind: once we realize that there are no
> working replication connections, automatically do a fast shutdown
> instead, ie, forcibly roll back those transactions that are never
> gonna complete.  Or at least have the postmaster bleat about it.  But
> I'm not sure what it'd take to code that, and am also unsure that it's
> something to undertake at this stage of the cycle.

Well, you certainly can't do that.  By the time a transaction is waiting
for sync rep, it's too late to roll back; the commit record is already,
and necessarily, on disk.  But in theory we could notice that all of the
remaining backends are waiting for sync rep, and switch to a fast
shutdown.

Several people have suggested refinements for smart shutdown in general,
such as switching to fast shutdown after a certain number of seconds, or
having backends exit at the end of the current transaction (or
immediately if idle).  Such things would both make this problem less
irksome and increase the overall utility of smart shutdown tremendously.
So maybe it's not worth expending too much effort on it right now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Sync Rep v19
On Thu, Mar 24, 2011 at 11:53 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
> On Thu, Mar 24, 2011 at 8:34 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
>> On Thu, Mar 24, 2011 at 11:17 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>>>>> Do you still want to work up a patch for this?  If so, I can review.
>>>>
>>>> Sure. Will do.
>>>
>>> The attached patch allows standby servers to connect during smart
>>> shutdown in order to wake up backends waiting for sync rep.
>>
>> I think that is possibly OK, but the big problem is the lack of a
>> clear set of comments about how the states are supposed to interact
>> that would allow these changes to be validated.  That state isn't down
>> to you, but I think we need that clear so this change is more
>> obviously correct than it is now.
>
> src/backend/replication/README needs to be updated to make that clear?

Not sure where we'd put them.

-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Sync Rep v19
On Wed, Mar 23, 2011 at 5:53 PM, Fujii Masao <masao.fu...@gmail.com> wrote:
>> Do you still want to work up a patch for this?  If so, I can review.
>
> Sure. Will do.

The attached patch allows standby servers to connect during smart
shutdown in order to wake up backends waiting for sync rep.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

allow_standby_to_connect_during_smart_shutdown_v1.patch
Description: Binary data
Re: [HACKERS] Sync Rep v19
On Thu, Mar 24, 2011 at 11:17 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
> On Wed, Mar 23, 2011 at 5:53 PM, Fujii Masao <masao.fu...@gmail.com> wrote:
>>> Do you still want to work up a patch for this?  If so, I can review.
>>
>> Sure. Will do.
>
> The attached patch allows standby servers to connect during smart
> shutdown in order to wake up backends waiting for sync rep.

I think that is possibly OK, but the big problem is the lack of a clear
set of comments about how the states are supposed to interact that would
allow these changes to be validated.  That state isn't down to you, but
I think we need that clear so this change is more obviously correct than
it is now.

-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Sync Rep v19
On Thu, Mar 24, 2011 at 8:34 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
> On Thu, Mar 24, 2011 at 11:17 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>> On Wed, Mar 23, 2011 at 5:53 PM, Fujii Masao <masao.fu...@gmail.com> wrote:
>>>> Do you still want to work up a patch for this?  If so, I can review.
>>>
>>> Sure. Will do.
>>
>> The attached patch allows standby servers to connect during smart
>> shutdown in order to wake up backends waiting for sync rep.
>
> I think that is possibly OK, but the big problem is the lack of a clear
> set of comments about how the states are supposed to interact that
> would allow these changes to be validated.  That state isn't down to
> you, but I think we need that clear so this change is more obviously
> correct than it is now.

src/backend/replication/README needs to be updated to make that clear?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] sync rep fsync=off
On Sun, Mar 20, 2011 at 5:31 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Sat, Mar 19, 2011 at 11:34 AM, Grzegorz Jaskiewicz
> <g...@pointblue.com.pl> wrote:
>> On 18 Mar 2011, at 21:12, Robert Haas wrote:
>>> While investigating Simon's complaint about my patch of a few days
>>> ago, I discovered that synchronous replication appears to slow to a
>>> crawl if fsync is turned off on the standby.  I'm not sure why this
>>> is happening or what the right behavior is in this case, but I think
>>> some kind of adjustment is needed because the current behavior is
>>> quite surprising.
>>
>> We have a few servers here running 8.3.  A few weeks ago I had to
>> populate one database with quite a number of entries.  I have a script
>> that does that, but it takes a while.  I decided to turn fsync off.
>> Oddly enough, the server started to crawl quite badly; load was very
>> high.  That was 8.3 on RHEL 5.4.  My point is, it is sometimes a bad
>> combination of disks and controllers that does that, not necessarily
>> software.  fsync off doesn't always mean that things are going to fly;
>> it can expose hardware bottlenecks much more quickly.
>
> Well, it's possible.  But I think it'd be worth a look at the code to
> see if there's some bad interaction there between the no-fsync code
> and the sync-rep code - like, if we don't actually fsync, does the
> flush pointer ever get updated?

No, as far as I read the code.  Disabling fsync increases the time taken
to close the WAL file?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] Sync Rep v19
On Sat, Mar 19, 2011 at 11:28 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Fri, Mar 18, 2011 at 10:25 PM, Robert Haas <robertmh...@gmail.com> wrote:
>> On Tue, Mar 8, 2011 at 7:05 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>>> * Smart shutdown
>>> Smart shutdown should wait for all the waiting backends to be acked,
>>> and should not cause them to forcibly exit.  But this leads shutdown
>>> to get stuck infinitely if there is no walsender at that time.  To
>>> enable them to be acked even in that situation, we need to change
>>> postmaster so that it accepts replication connections even during
>>> smart shutdown (until we reach PM_SHUTDOWN_2 state).  Postmaster
>>> already accepts a superuser connection to cancel a backup during
>>> smart shutdown.  So I don't think that the idea of accepting
>>> replication connections during smart shutdown is so ugly.
>>>
>>> * Fast shutdown
>>> I agree with you about fast shutdown.  Fast shutdown should cause all
>>> the backends, including waiting ones, to exit immediately.  At that
>>> time, the non-acked backend should not return success, according to
>>> the definition of sync rep.  So we need to change a backend so that
>>> it removes itself from the waiting queue and exits before returning
>>> success, when it receives SIGTERM.  This change leads the waiting
>>> backends to do the same even when pg_terminate_backend is called.
>>> But since they've not been acked yet, it seems reasonable to prevent
>>> them from returning the COMMIT.
>>>
>>> Comments?  I'll create the patch barring objection.
>>
>> The fast smart shutdown part of this problem has been addressed.  The
>
> Ugh.  I mean the fast shutdown, of course, not the fast smart shutdown.
> Anyway, the point is:
>
> fast shutdown now OK
> smart shutdown still not OK
>
> do you want to write a patch?  :-)
>
>> smart shutdown case still needs work, and I think the consensus was
>> that your proposal above was the best way to go with it.  Do you still
>> want to work up a patch for this?  If so, I can review.

Sure. Will do.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On 2011-03-21 23:58, Yeb Havinga wrote:
> On Mon, Mar 21, 2011 at 7:51 PM, Yeb Havinga <yebhavi...@gmail.com> wrote:
>> On 2011-03-21 18:04, Robert Haas wrote:
>>> On Mon, Mar 21, 2011 at 12:29 PM, Yeb Havinga <yebhavi...@gmail.com> wrote:
>>>> pgbench -i -s 50 test
>>>> Two runs of pgbench -c 10 -M prepared -T 600 test with 1 sync standby
>>>> - server configs etc were mailed upthread.
>>>> - performance as of commit e148443ddd95cd29edf4cc1de6188eb9cee029c5:
>>>>   1158 and 1306 (avg 1232)
>>>> - performance as of current git master:
>>>>   1181 and 1280 (avg 1230.5)
>>>> - performance as of current git master with
>>>>   sync-standbys-defined-rearrangement applied:
>>>>   1152 and 1269 (avg 1210.5)
>>
>> IMO what these tests have shown is that there is no 20% performance
>> difference between the different versions.  To determine if there are
>> differences, n should be a lot higher, or perhaps a single run with a
>> very long duration.
>
> pgbench -T 3600:
> sync-standbys-defined-rearrangement  1270 tps
> current git master                   1306 tps

Result of pgbench -T 3
sync-standbys-defined-rearrangement  1267 tps
current (or few days old) git master 1326 tps

So the patch eats 4.5% from git master's syncrep performance in my
setup.  Don't know how to measure it better than that.

-- 
Yeb Havinga
http://www.mgrid.net/
Mastering Medical Data
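For reference, the quoted percentage can be checked from the long-run tps figures above (a trivial calculation using the numbers as given; nothing here is new data):

```python
# Regression implied by the long pgbench runs quoted above:
# patched (sync-standbys-defined-rearrangement) vs. git master.
patched_tps = 1267.0
master_tps = 1326.0

loss_pct = (master_tps - patched_tps) / master_tps * 100
print(round(loss_pct, 1))  # 4.4 -- i.e. the quoted "4.5%", give or take rounding
```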
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Tue, Mar 22, 2011 at 3:25 PM, Yeb Havinga <yebhavi...@gmail.com> wrote:
> So the patch eats 4.5% from git master's syncrep performance in my
> setup.  Don't know how to measure it better than that.

That's quite surprising, but I guess the way forward is clear: don't
apply that patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On 2011-03-21 02:05, Robert Haas wrote:
> On Sun, Mar 20, 2011 at 11:03 AM, Yeb Havinga <yebhavi...@gmail.com> wrote:
>> On 2011-03-20 05:44, Robert Haas wrote:
>>> Hmm, I'm not going to be able to reproduce this here, and my test
>>> setup didn't show a clear regression.  I can try beating on it some
>>> more, but...  Any chance you could rerun your test with the latest
>>> master-branch code, and perhaps also with the patch I proposed
>>> upthread to remove a branch from the section protected by
>>> SyncRepLock?  I can't really tell from reading the emails you linked
>>> what was responsible for the slowdowns and speedups, and it is
>>> unclear to me how much impact my recent changes actually had.
>>
>> No problem.  Could you tell me the name of the "remove a branch from
>> the section protected by SyncRepLock" patch, or perhaps a
>> message-link?  Upthread I see sync-standbys-defined-rearrangement.patch
>> but also two sync-rep-wait-fixes.
>
> Thanks!  The things I'd like to see compared are:

pgbench -i -s 50 test
Two runs of pgbench -c 10 -M prepared -T 600 test with 1 sync standby
- server configs etc were mailed upthread.
- performance as of commit e148443ddd95cd29edf4cc1de6188eb9cee029c5:
  1158 and 1306 (avg 1232)
- performance as of current git master:
  1181 and 1280 (avg 1230.5)
- performance as of current git master with
  sync-standbys-defined-rearrangement applied:
  1152 and 1269 (avg 1210.5)

-- 
Yeb Havinga
http://www.mgrid.net/
Mastering Medical Data
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Mon, Mar 21, 2011 at 12:29 PM, Yeb Havinga <yebhavi...@gmail.com> wrote:
> pgbench -i -s 50 test
> Two runs of pgbench -c 10 -M prepared -T 600 test with 1 sync standby
> - server configs etc were mailed upthread.
> - performance as of commit e148443ddd95cd29edf4cc1de6188eb9cee029c5:
>   1158 and 1306 (avg 1232)
> - performance as of current git master:
>   1181 and 1280 (avg 1230.5)
> - performance as of current git master with
>   sync-standbys-defined-rearrangement applied:
>   1152 and 1269 (avg 1210.5)

Hmm, that doesn't appear to show the 20% regression Simon claimed
upthread.  That's good... but I'm confused as to how you are getting
numbers this high at all without a BBU.  If every commit has to wait for
two consecutive fsyncs, cranking out 1200+ commits per second is a lot.
Maybe it's just barely plausible if these are 15K drives and all the
commits are piggybacking on the fsyncs at top speed, but, man, that's
fast.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
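Robert's back-of-the-envelope reasoning can be made concrete. The numbers below are illustrative assumptions (best case of one flush per platter revolution, and ignoring the second, standby-side flush that sync rep adds), not measurements from the thread:

```python
# How many commits must share each fsync to sustain 1200+ tps on a
# 15K RPM drive?  Assume at best one rotation-bound flush per revolution.
rpm = 15000
flushes_per_sec = rpm / 60.0      # 250 flushes/sec at the very best
observed_tps = 1200
commits_per_flush = observed_tps / flushes_per_sec
print(commits_per_flush)  # 4.8 -> about five commits piggybacking on each flush
```

That is why the figures only seem plausible with heavy group commit (or, as it turns out downthread, with write barriers disabled).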
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On 2011-03-21 18:04, Robert Haas wrote:
> On Mon, Mar 21, 2011 at 12:29 PM, Yeb Havinga <yebhavi...@gmail.com> wrote:
>> pgbench -i -s 50 test
>> Two runs of pgbench -c 10 -M prepared -T 600 test with 1 sync standby
>> - server configs etc were mailed upthread.
>> - performance as of commit e148443ddd95cd29edf4cc1de6188eb9cee029c5:
>>   1158 and 1306 (avg 1232)
>> - performance as of current git master:
>>   1181 and 1280 (avg 1230.5)
>> - performance as of current git master with
>>   sync-standbys-defined-rearrangement applied:
>>   1152 and 1269 (avg 1210.5)

I ran another pgbench with this last setup, which gives it a 1240.33
average:

tps = 1300.786386 (including connections establishing)
tps = 1300.844220 (excluding connections establishing)

IMO what these tests have shown is that there is no 20% performance
difference between the different versions.  To determine if there are
differences, n should be a lot higher, or perhaps a single run with a
very long duration.

> Hmm, that doesn't appear to show the 20% regression Simon claimed
> upthread.  That's good... but I'm confused as to how you are getting
> numbers this high at all without a BBU.

For the sake of testing syncrep, I put xfs in nobarrier mode on both
master and standby:

/dev/sdc1 on /xlog type xfs (rw,noatime,nodiratime,nobarrier)
/dev/md11 on /archive type xfs (rw,noatime,nodiratime,nobarrier,logdev=/dev/sdc3)
/dev/md10 on /data type xfs (rw,noatime,nodiratime,nobarrier,logdev=/dev/sdc2)

-- 
Yeb Havinga
http://www.mgrid.net/
Mastering Medical Data
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Mon, Mar 21, 2011 at 7:51 PM, Yeb Havinga yebhavi...@gmail.com wrote: On 2011-03-21 18:04, Robert Haas wrote: On Mon, Mar 21, 2011 at 12:29 PM, Yeb Havingayebhavi...@gmail.com wrote: pgbench -i -s 50 test Two runs of pgbench -c 10 -M prepared -T 600 test with 1 sync standby - server configs etc were mailed upthread. - performance as of commit e148443ddd95cd29edf4cc1de6188eb9cee029c5 1158 and 1306 (avg 1232) - performance as of current git master 1181 and 1280 (avg 1230,5) - performance as of current git master with sync-standbys-defined-rearrangement applied 1152 and 1269 (avg 1210,5) IMO what these tests have shown is that there is no 20% performance difference between the different versions. To determine if there are differences, n should be a lot higher, or perhaps a single one with a very large duration. pgbench -T 3600: sync-standbys-defined-rearrangement 1270 tps current git master 1306 tps -- Yeb Havinga http://www.mgrid.net/ Mastering Medical Data
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On 2011-03-20 05:44, Robert Haas wrote: Hmm, I'm not going to be able to reproduce this here, and my test setup didn't show a clear regression. I can try beating on it some more, but... Any chance you could rerun your test with the latest master-branch code, and perhaps also with the patch I proposed upthread to remove a branch from the section protection by SyncRepLock? I can't really tell from reading the emails you linked what was responsible for the slowdowns and speedups, and it is unclear to me how much impact my recent changes actually had. No problem. Could you tell me the name of the "remove a branch from the section protection by SyncRepLock" patch, or perhaps a message link? Upthread I see sync-standbys-defined-rearrangement.patch but also two sync-rep-wait-fixes. regards, Yeb Havinga -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Sun, Mar 20, 2011 at 11:03 AM, Yeb Havinga yebhavi...@gmail.com wrote: On 2011-03-20 05:44, Robert Haas wrote: Hmm, I'm not going to be able to reproduce this here, and my test setup didn't show a clear regression. I can try beating on it some more, but... Any chance you could rerun your test with the latest master-branch code, and perhaps also with the patch I proposed upthread to remove a branch from the section protection by SyncRepLock? I can't really tell from reading the emails you linked what was responsible for the slowdowns and speedups, and it is unclear to me how much impact my recent changes actually had. No problem. Could you tell me the name of the remove a branch from the section protection by SyncRepLock ? patch, or perhaps a message-link? Upthread I see sync-standbys-defined-rearrangement.patch but also two sync-rep-wait-fixes. Thanks! The things I'd like to see compared are: - performance as of commit e148443ddd95cd29edf4cc1de6188eb9cee029c5 - performance as of current git master - performance as of current git master with sync-standbys-defined-rearrangement applied -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On 2011-03-18 18:25, Robert Haas wrote: On Fri, Mar 18, 2011 at 1:15 PM, Simon Riggs si...@2ndquadrant.com wrote: On Thu, 2011-03-17 at 09:33 -0400, Robert Haas wrote: Thanks for the review! Lets have a look here... You've added a test inside the lock to see if there is a standby, which I took out for performance reasons. Maybe there's another way, I know that code is fiddly. You've also added back in the lock acquisition at wakeup with very little justification, which was a major performance hit. Together that's about a 20% hit in performance in Yeb's tests. I think you should spend a little time thinking how to retune that. Ouch. Do you have a link that describes his testing methodology? I will look at it. Testing 'methodology' sounds a bit heavy. I tested a number of patch versions over time, with 30 second, hourly and nightly pgbench runs. The nightly more for durability/memory leak testing than tps numbers, since I gradually got the impression that pg_bench tests on syncrep setups somehow suffer less from big differences between tests. postgres and recovery.conf I used to test v17 is listed here http://archives.postgresql.org/pgsql-hackers/2011-02/msg02364.php After the tests on v17 I played a bit with small memory changes in the postgres.conf to see if the tps would go up. It went up a little but not enough to mention on the lists. All tests after v17 were done with the postgres.conf that I've copy pasted below. I mentioned a performance regression in http://archives.postgresql.org/pgsql-hackers/2011-03/msg00298.php And performance improvement in http://archives.postgresql.org/pgsql-hackers/2011-03/msg00464.php All three servers (el cheapo consumer grade) the same: triple core AMD's, 16GB ECC, raid 0 over 2 SATA disks, XFS, nobarrier, separated data and xlog partitions. NB: there is no BBU controller in these servers. They don't run production stuff, it's just for testing. 1Gbit ethernet on non-blocking HP switch. No other load. 
./configure --enable-depend --with-ossp-uuid --with-libxml --prefix=/mgrid/postgres regards, Yeb Havinga Here's the postgresql.conf non-default I used after each new initdb. (synchronous_replication is off since it prevented me from adding a replication user, so after a initial basebackup I needed to turn it on) #-- # CUSTOMIZED OPTIONS #-- #custom_variable_classes = '' # list of custom variable class names #shared_preload_libraries = 'pg_stat_statements' #custom_variable_classes = 'pg_stat_statements' #pg_stat_statements.max = 100 #pg_stat_statements.track = all syslog_ident = relay autovacuum = off #debug_print_parse = on #debug_print_rewritten = on #debug_print_plan = on #debug_pretty_print = on log_error_verbosity = verbose log_min_messages = warning log_min_error_statement = warning listen_addresses = '*'# what IP address(es) to listen on; search_path='\$user\, public, hl7' archive_mode = on archive_command = 'cd .' checkpoint_completion_target = 0.9 checkpoint_segments = 16 default_statistics_target = 500 constraint_exclusion = on max_connections = 100 maintenance_work_mem = 528MB effective_cache_size = 5GB work_mem = 144MB wal_buffers = 8MB shared_buffers = 528MB wal_level = 'archive' max_wal_senders = 10 wal_keep_segments = 100 # 1600MB (for production increase this) synchronous_standby_names = 'standby1,standby2,standby3' #synchronous_replication = on -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sync rep fsync=off
On 18 Mar 2011, at 21:12, Robert Haas wrote: While investigating Simon's complaint about my patch of a few days ago, I discovered that synchronous replication appears to slow to a crawl if fsync is turned off on the standby. I'm not sure why this is happening or what the right behavior is in this case, but I think some kind of adjustment is needed because the current behavior is quite surprising. We have a few servers here running 8.3. A few weeks ago I had to populate one database with quite a number of entries. I have a script that does that, but it takes a while. I decided to turn fsync off. Oddly enough, the server started to crawl quite badly; load was very high. That was 8.3 on RHEL 5.4. My point is, it is sometimes a bad combination of disks and controllers that does that, not necessarily software. fsync off doesn't always mean that things are going to fly; it can expose hardware bottlenecks much quicker. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sync rep fsync=off
On Sat, Mar 19, 2011 at 11:34 AM, Grzegorz Jaskiewicz g...@pointblue.com.pl wrote: On 18 Mar 2011, at 21:12, Robert Haas wrote: While investigating Simon's complaint about my patch of a few days ago, I discovered that synchronous replication appears to slow to a crawl if fsync is turned off on the standby. I'm not sure why this is happening or what the right behavior is in this case, but I think some kind of adjustment is needed because the current behavior is quite surprising. We have few servers here running 8.3. And few weeks ago I had to populate one database with quite a number of entries. I have script that does that, but it takes a while. I decided to turn fsck to off. Oddly enough, the server started to crawl quite badly, load was very high. That was 8.3 on rhel 5.4. My point is, it is sometimes bad combination of disks and controllers that does that. Not necessarily software. fsync off doesn't always mean that things are going to fly, it can cause it to expose hardware bottlenecks much quicker. Well, it's possible. But I think it'd be worth a look at the code to see if there's some bad interaction there between the no-fsync code and the sync-rep code - like, if we don't actually fsync, does the flush pointer ever get updated? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Sat, Mar 19, 2011 at 10:32 AM, Yeb Havinga yebhavi...@gmail.com wrote: Testing 'methodology' sounds a bit heavy. I tested a number of patch versions over time, with 30 second, hourly and nightly pgbench runs. The nightly more for durability/memory leak testing than tps numbers, since I gradually got the impression that pg_bench tests on syncrep setups somehow suffer less from big differences between tests. postgres and recovery.conf I used to test v17 is listed here http://archives.postgresql.org/pgsql-hackers/2011-02/msg02364.php After the tests on v17 I played a bit with small memory changes in the postgres.conf to see if the tps would go up. It went up a little but not enough to mention on the lists. All tests after v17 were done with the postgres.conf that I've copy pasted below. Hmm, I'm not going to be able to reproduce this here, and my test setup didn't show a clear regression. I can try beating on it some more, but... Any chance you could rerun your test with the latest master-branch code, and perhaps also with the patch I proposed upthread to remove a branch from the section protection by SyncRepLock? I can't really tell from reading the emails you linked what was responsible for the slowdowns and speedups, and it is unclear to me how much impact my recent changes actually had. I would have expected the dominant cost to be waiting for the slave to complete its fsync, with network overhead as runner-up. And indeed this appears to be the case on my ext4-based system. I would not have expected contention on SyncRepLock to be much of a factor. It strikes me that if contention on SyncRepLock IS the dominating factor, the whole approach to queue management is pretty well busted. *Every* walsender takes SyncRepLock in exclusive mode on receipt of *every* standby reply message. That seems rather inefficient. To make matters worse, every time a walsender grabs SyncRepLock, it redoes the whole computation of who the synchronous standby is from scratch. 
It strikes me that it ought to be possible to rejigger things so that when a walsender exits, it signals any other walsenders that exist to recheck whether they need to take over the role of synchronous standby; then, each walsender needs to take the lock and recheck only when it first connects, and each time it's signalled thereafter. When a walsender does decide that a change in the synchronous standby is needed, it should store the ID of the new walsender in shared memory, in a variable protected by SyncRepLock, so that the current synchronous standby can notice the change with a simple integer comparison. It also strikes me that we ought to be able to rejigger things so that the backends being woken up never need to touch shared memory at all - i.e. eliminate syncRepState - thus reducing cache line contention. If WaitLatch() returns true, then the latch was set, presumably by walsender. My recent patch added a couple of places where the latch can be set by the waiting process itself in response to an interrupt, but that case can be handled by adding a backend-local flag variable indicating whether we set the latch ourselves. If we determine that the latch is set and the did-it-myself flag is clear, we can conclude that we were awakened by a walsender and call it good. If the latch is set and the did-it-myself flag is also set, then we were interrupted, and we MAY also have been awakened by walsender at around the same time. We grab SyncRepLock to remove ourselves from the queue, and if we find we're already removed, then we know we were interrupted just as walsender awakened us; otherwise, it's a pure interrupt. It'd be interesting to see the results of some testing with LWLOCK_STATS defined, to see whether SyncRepLock actually is contended and if so to what degree. 
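The backend-side wake-up protocol Robert describes can be condensed into a small decision table. The sketch below is a hypothetical simplification — the enum and function names are invented for illustration, not taken from syncrep.c — but it captures the claim that a backend-local did-it-myself flag plus a single queue check (made under SyncRepLock) are enough to classify every wake-up:

```c
#include <stdbool.h>

typedef enum
{
    WOKEN_BY_WALSENDER,     /* latch set by a walsender: wait is over, and
                             * no shared-memory read is needed at all */
    PURE_INTERRUPT,         /* we set our own latch and are still queued */
    INTERRUPT_AND_WOKEN     /* we set our own latch, but a walsender had
                             * already removed us from the queue */
} WakeupKind;

/*
 * Classify a latch wake-up.  set_latch_myself is the proposed
 * backend-local flag; still_in_queue must be checked under SyncRepLock
 * in the real code, but here it is just a parameter.
 */
WakeupKind
classify_wakeup(bool set_latch_myself, bool still_in_queue)
{
    if (!set_latch_myself)
        return WOKEN_BY_WALSENDER;  /* fast path: no lock, no syncRepState */
    return still_in_queue ? PURE_INTERRUPT : INTERRUPT_AND_WOKEN;
}
```

Note that only the two interrupt cases ever need to touch shared memory, which is the point of the proposal: the common walsender-wakeup path becomes lock-free.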
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Thu, Mar 17, 2011 at 6:00 PM, Jeff Davis pg...@j-davis.com wrote: On Wed, 2011-03-16 at 13:35 -0400, Robert Haas wrote: 2. If a query cancel interrupt is received (pg_cancel_backend or ^C), then cancel the sync rep wait and issue a warning before acknowledging the commit. When I saw this commit, I noticed that the WARNING doesn't have an errcode(). It seems like it should -- this is the kind of thing that the client is likely to care about, and may want to handle specially. Should I invent ERRCODE_WARNING_TRANSACTION_NOT_REPLICATED? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Fri, 2011-03-18 at 08:27 -0400, Robert Haas wrote: On Thu, Mar 17, 2011 at 6:00 PM, Jeff Davis pg...@j-davis.com wrote: On Wed, 2011-03-16 at 13:35 -0400, Robert Haas wrote: 2. If a query cancel interrupt is received (pg_cancel_backend or ^C), then cancel the sync rep wait and issue a warning before acknowledging the commit. When I saw this commit, I noticed that the WARNING doesn't have an errcode(). It seems like it should -- this is the kind of thing that the client is likely to care about, and may want to handle specially. Should I invent ERRCODE_WARNING_TRANSACTION_NOT_REPLICATED? I think it's reasonable to invent a new code here. Perhaps use the word synchronous rather than replicated, though? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Fri, Mar 18, 2011 at 10:17 AM, Jeff Davis pg...@j-davis.com wrote: On Fri, 2011-03-18 at 08:27 -0400, Robert Haas wrote: On Thu, Mar 17, 2011 at 6:00 PM, Jeff Davis pg...@j-davis.com wrote: On Wed, 2011-03-16 at 13:35 -0400, Robert Haas wrote: 2. If a query cancel interrupt is received (pg_cancel_backend or ^C), then cancel the sync rep wait and issue a warning before acknowledging the commit. When I saw this commit, I noticed that the WARNING doesn't have an errcode(). It seems like it should -- this is the kind of thing that the client is likely to care about, and may want to handle specially. Should I invent ERRCODE_WARNING_TRANSACTION_NOT_REPLICATED? I think it's reasonable to invent a new code here. Perhaps use the word synchronous rather than replicated, though? I think we have to, because it's definitely not the same situation that someone would expect after ERRCODE_QUERY_CANCELLED. But ERRCODE_WARNING_TRANSACTION_NOT_SYNCHRONOUS, which is what I read you reply as suggesting, seems pretty wonky. I wouldn't know what that meant. Another option might be: ERRCODE_(WARNING_?)REPLICATION_WAIT_CANCELLED ...which might have something to recommend it. Other thoughts? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Fri, 2011-03-18 at 10:27 -0400, Robert Haas wrote: ERRCODE_(WARNING_?)REPLICATION_WAIT_CANCELLED ...which might have something to recommend it. Works for me. Regards, Jeff Davis
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On 18.03.2011 17:38, Jeff Davis wrote: On Fri, 2011-03-18 at 10:27 -0400, Robert Haas wrote: ERRCODE_(WARNING_?)REPLICATION_WAIT_CANCELLED ...which might have something to recommend it. Works for me. Yes, sounds reasonable. Without WARNING_, please. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Fri, Mar 18, 2011 at 11:58 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 18.03.2011 17:38, Jeff Davis wrote: On Fri, 2011-03-18 at 10:27 -0400, Robert Haas wrote: ERRCODE_(WARNING_?)REPLICATION_WAIT_CANCELLED ...which might have something to recommend it. Works for me. Yes, sounds reasonable. Without WARNING_, please. The reason I included WARNING is because warnings have their own section in errcodes.txt, and each errcode is marked E for error or W for warning. Since we CAN'T actually error out here, I thought it might be more appropriate to make this a warning; and all of the existing such codes contain WARNING. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
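For context on this exchange: entries in errcodes.txt pair a SQLSTATE with an E/W/S flag, the C macro name, and a spec condition name, and warnings live in Class 01 with the W flag. If the warning spelling were chosen, the new entry would presumably look roughly like the sketch below — the SQLSTATE value 01P02 is purely illustrative, not an assigned code:

```
Section: Class 01 - Warning

01000    W    ERRCODE_WARNING                               warning
# hypothetical new entry; SQLSTATE chosen for illustration only
01P02    W    ERRCODE_WARNING_REPLICATION_WAIT_CANCELLED    replication_wait_cancelled
```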
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Thu, 2011-03-17 at 09:33 -0400, Robert Haas wrote: Thanks for the review! Lets have a look here... You've added a test inside the lock to see if there is a standby, which I took out for performance reasons. Maybe there's another way, I know that code is fiddly. You've also added back in the lock acquisition at wakeup with very little justification, which was a major performance hit. Together that's about a 20% hit in performance in Yeb's tests. I think you should spend a little time thinking how to retune that. I see handling added for ProcDiePending and QueryCancelPending directly into syncrep.c without any comments in postgres.c to indicate that you bypass ProcessInterrupts() in some cases. That looks pretty hokey to me. SyncRepUpdateSyncStandbysDefined() is added into walwriter, which means waiters won't be released if we do a sighup during a fast shutdown, since the walwriter gets killed as soon as that starts. I'm thinking bgwriter should handle that now. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Fri, Mar 18, 2011 at 1:15 PM, Simon Riggs si...@2ndquadrant.com wrote: On Thu, 2011-03-17 at 09:33 -0400, Robert Haas wrote: Thanks for the review! Lets have a look here... You've added a test inside the lock to see if there is a standby, which I took out for performance reasons. Maybe there's another way, I know that code is fiddly. You've also added back in the lock acquisition at wakeup with very little justification, which was a major performance hit. Together that's about a 20% hit in performance in Yeb's tests. I think you should spend a little time thinking how to retune that. Ouch. Do you have a link that describes his testing methodology? I will look at it. I see handling added for ProcDiePending and QueryCancelPending directly into syncrep.c without any comments in postgres.c to indicate that you bypass ProcessInterrupts() in some cases. That looks pretty hokey to me. I can add some comments. Unfortunately, it's not feasible to call ProcessInterrupts() directly from this point in the code - it causes a database panic. SyncRepUpdateSyncStandbysDefined() is added into walwriter, which means waiters won't be released if we do a sighup during a fast shutdown, since the walwriter gets killed as soon as that starts. I'm thinking bgwriter should handle that now. Hmm. I was thinking that doing it in WAL writer would make it more responsive, but since this is a fairly unlikely scenario, it's probably not worth complicating the shutdown sequence to do it the way I did. I'll move it to bgwriter. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
Excerpts from Robert Haas's message of vie mar 18 14:25:16 -0300 2011: On Fri, Mar 18, 2011 at 1:15 PM, Simon Riggs si...@2ndquadrant.com wrote: SyncRepUpdateSyncStandbysDefined() is added into walwriter, which means waiters won't be released if we do a sighup during a fast shutdown, since the walwriter gets killed as soon as that starts. I'm thinking bgwriter should handle that now. Hmm. I was thinking that doing it in WAL writer would make it more responsive, but since this is a fairly unlikely scenario, it's probably not worth complicating the shutdown sequence to do it the way I did. I'll move it to bgwriter. Can't they both do it? -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Fri, Mar 18, 2011 at 2:55 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Excerpts from Robert Haas's message of vie mar 18 14:25:16 -0300 2011: On Fri, Mar 18, 2011 at 1:15 PM, Simon Riggs si...@2ndquadrant.com wrote: SyncRepUpdateSyncStandbysDefined() is added into walwriter, which means waiters won't be released if we do a sighup during a fast shutdown, since the walwriter gets killed as soon as that starts. I'm thinking bgwriter should handle that now. Hmm. I was thinking that doing it in WAL writer would make it more responsive, but since this is a fairly unlikely scenario, it's probably not worth complicating the shutdown sequence to do it the way I did. I'll move it to bgwriter. Can't they both do it? Yeah, but it seems fairly pointless. In retrospect, I probably should have done it the way Simon is proposing to begin with. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
Responding to this again, somewhat out of order... On Fri, Mar 18, 2011 at 1:15 PM, Simon Riggs si...@2ndquadrant.com wrote: Together that's about a 20% hit in performance in Yeb's tests. I think you should spend a little time thinking how to retune that. I've spent some time playing around with pgbench and so far I haven't been able to reliably reproduce this, which is not to say I don't believe the effect is real, but rather that either I'm doing something completely wrong, or it requires some specific setup to measure that doesn't match my environment, or that it's somewhat finicky to reproduce, or some combination of the above. You've added a test inside the lock to see if there is a standby, which I took out for performance reasons. Maybe there's another way, I know that code is fiddly. It seems pretty easy to remove the branch from the test at the top of the function by just rearranging things a bit. Patch attached; does this help? You've also added back in the lock acquisition at wakeup with very little justification, which was a major performance hit. I have a very difficult time believing this is a real problem. That extra lock acquisition and release only happens if WaitLatchOrSocket() returns but MyProc->syncRepState still appears to be SYNC_REP_WAITING. That should only happen if the latch wait hits the timeout (which takes 60 s!) or if the precise memory-ordering problem that the lock was put in to fix is occurring (in which case it should dramatically *improve* performance, by avoiding an extra 60 s wait). I stuck in a call to elog(LOG, "got here") and it didn't fire even once in a 5-minute pgbench test (~45k transactions). So I have a hard time crediting this for any performance problem. 
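The attached rearrangement can be thought of as hoisting the "is there anything to wait for?" test out of the locked section. The sketch below is an invented simplification (the names are hypothetical, not taken from the actual patch): the cheap predicate runs without the lock, and SyncRepLock is taken only when it passes.

```c
#include <stdbool.h>

/*
 * Hypothetical fast-path predicate: a backend should join the sync-rep
 * wait queue only if it asked for synchronous commit AND some
 * synchronous standby is configured.  Reading standbys_defined without
 * SyncRepLock is acceptable because the check is merely advisory; the
 * authoritative re-check happens after the lock is acquired.
 */
bool
sync_rep_should_wait(bool sync_commit_requested, bool standbys_defined)
{
    return sync_commit_requested && standbys_defined;
}
```

The design point being tested in this subthread is exactly this: when the predicate is false, the commit path never touches SyncRepLock at all.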
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company sync-standbys-defined-rearrangement.patch Description: Binary data
Re: [HACKERS] Sync Rep v19
On Tue, Mar 8, 2011 at 7:05 AM, Fujii Masao masao.fu...@gmail.com wrote: * Smart shutdown Smart shutdown should wait for all the waiting backends to be acked, and should not cause them to forcibly exit. But this leads shutdown to get stuck infinitely if there is no walsender at that time. To enable them to be acked even in that situation, we need to change postmaster so that it accepts the replication connection even during smart shutdown (until we reach PM_SHUTDOWN_2 state). Postmaster has already accepted the superuser connection to cancel backup during smart shutdown. So I don't think that the idea to accept the replication connection during smart shutdown is so ugly. * Fast shutdown I agree with you about fast shutdown. Fast shutdown should cause all the backends including waiting ones to exit immediately. At that time, the non-acked backend should not return the success, according to the definition of sync rep. So we need to change a backend so that it gets rid of itself from the waiting queue and exits before returning the success, when it receives SIGTERM. This change leads the waiting backends to do the same even when pg_terminate_backend is called. But since they've not been acked yet, it seems to be reasonable to prevent them from returning the COMMIT. Comments? I'll create the patch barring objection. The fast smart shutdown part of this problem has been addressed. The smart shutdown case still needs work, and I think the consensus was that your proposal above was the best way to go with it. Do you still want to work up a patch for this? If so, I can review. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Sync Rep v19
On Fri, Mar 18, 2011 at 10:25 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, Mar 8, 2011 at 7:05 AM, Fujii Masao masao.fu...@gmail.com wrote: * Smart shutdown Smart shutdown should wait for all the waiting backends to be acked, and should not cause them to forcibly exit. But this leads shutdown to get stuck infinitely if there is no walsender at that time. To enable them to be acked even in that situation, we need to change postmaster so that it accepts the replication connection even during smart shutdown (until we reach PM_SHUTDOWN_2 state). Postmaster has already accepted the superuser connection to cancel backup during smart shutdown. So I don't think that the idea to accept the replication connection during smart shutdown is so ugly. * Fast shutdown I agree with you about fast shutdown. Fast shutdown should cause all the backends including waiting ones to exit immediately. At that time, the non-acked backend should not return the success, according to the definition of sync rep. So we need to change a backend so that it gets rid of itself from the waiting queue and exits before returning the success, when it receives SIGTERM. This change leads the waiting backends to do the same even when pg_terminate_backend is called. But since they've not been acked yet, it seems to be reasonable to prevent them from returning the COMMIT. Comments? I'll create the patch barring objection. The fast smart shutdown part of this problem has been addressed. The Ugh. I mean the fast shutdown, of course, not the fast smart shutdown. Anyway, point is: fast shutdown now OK smart shutdown still not OK do you want to write a patch? :-) smart shutdown case still needs work, and I think the consensus was that your proposal above was the best way to go with it. Do you still want to work up a patch for this? If so, I can review. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Thu, Mar 17, 2011 at 2:35 AM, Robert Haas robertmh...@gmail.com wrote: 1. If a die interrupt is received (pg_terminate_backend or fast shutdown), then terminate the sync rep wait and arrange for the connection to be closed without acknowledging the commit (but do send a warning message back). The commit still happened, though, so other transactions will see its effects. This is unavoidable unless we're willing to either ignore attempts to terminate a backend waiting for sync rep, or panic the system when it happens, and I don't think either of those is appropriate. OK. 2. If a query cancel interrupt is received (pg_cancel_backend or ^C), then cancel the sync rep wait and issue a warning before acknowledging the commit. Again, the alternative is to either ignore the cancel or panic, neither of which I believe to be what users will want. OK. 3. If synchronous_standby_names is changed to '' by editing postgresql.conf and issuing pg_ctl reload, then cancel all waits in progress and wake everybody up. As I mentioned before, reloading the config file from within the waiting backend (which can't safely throw an error) seems risky, AFAIR, ProcessConfigFile() doesn't throw an error. So I don't think that's so risky. But, as you said in another thread, reading the config file at that point is inconsistent; I agree. And it seems better to leave a background process to wake up the backends. so what I did instead is made WAL writer responsible for handling this. Nobody's allowed to wait for sync rep unless a global shared memory flag is set, and the WAL writer process is responsible for setting and clearing this flag when the config file is reloaded. This has basically no performance cost; WAL writer only ever does any extra work at all with this code when it receives a SIGHUP, and even then the work is trivial except in the case where synchronous_standby_names has changed from empty to non-empty or vice versa. 
The advantage of putting this in WAL writer rather than, say, bgwriter is that WAL writer doesn't have nearly as many jobs to do and they don't involve nearly as much I/O, so the chances of a long delay due to the process being busy are much less. This occurs to me; we should ensure that, in shutdown case, walwriter should exit after all the backends have gone out? I'm not sure if it's worth thinking of the case, but what if synchronous_standby_names is unset and config file is reloaded after smart shutdown is requested? In this case, the reload cannot wake up the waiting backends since walwriter has already exited. This behavior looks a bit inconsistent. 4. Remove the SYNC_REP_MUST_DISCONNECT state, which actually does absolutely nothing right now, despite what the name would seem to imply. In particular, it doesn't arrange for any sort of disconnect. This patch does arrange for that, but not using this mechanism. OK. 5. The existing code relies on being able to read MyProc-syncRepState without holding the lock, even while a WAL sender must be updating it in another process. I'm not 100% sure this is safe on a multi-processor machine with weak memory ordering. In practice, the chances of something going wrong here seem extremely small. You'd need something like this: a WAL sender updates MyProc-syncRepState just after the wait timeout expires and before the latch is reset, but the regular backend fails to see the state change due to memory-ordering effects and drops through the loop, waiting another 60 s, and then finally wakes up and completes the wait (but a minute later than expected). That seems vanishingly unlikely but it's also simple to protect against, so I did. In the patch, in order to read the latest value, you take a light-weight lock. But I wonder why taking a lock can ensure that the value is up-to-date. Review appreciated. Thanks! 
Here are some comments: + * WAL writer calls this as needed to update the shared sync_standbys_needed Typo: s/sync_standbys_needed/sync_standbys_defined + * we exit normally, or SYNC_REP_MUST_DISCONNECT in abnormal cases. Typo: the reference to SYNC_REP_MUST_DISCONNECT is not required. +* So in this case we issue a NOTICE (which some clients may Typo: s/NOTICE/WARNING + if (ProcDiePending) + { + ereport(WARNING, + (errcode(ERRCODE_ADMIN_SHUTDOWN), +errmsg(canceling the wait for replication and terminating connection due to administrator command), +errdetail(The transaction has already been committed locally but might have not been replicated to the standby.))); + whereToSendOutput = DestNone; + LWLockAcquire(SyncRepLock, LW_EXCLUSIVE); + MyProc-syncRepState = SYNC_REP_WAIT_COMPLETE; + SHMQueueDelete((MyProc-syncRepLinks));
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On 16.03.2011 19:35, Robert Haas wrote:

> 3. If synchronous_standby_names is changed to '' by editing postgresql.conf and issuing pg_ctl reload, then cancel all waits in progress and wake everybody up. As I mentioned before, reloading the config file from within the waiting backend (which can't safely throw an error) seems risky, so what I did instead is made WAL writer responsible for handling this. Nobody's allowed to wait for sync rep unless a global shared memory flag is set, and the WAL writer process is responsible for setting and clearing this flag when the config file is reloaded. This has basically no performance cost; WAL writer only ever does any extra work at all with this code when it receives a SIGHUP, and even then the work is trivial except in the case where synchronous_standby_names has changed from empty to non-empty or vice versa. The advantage of putting this in WAL writer rather than, say, bgwriter is that WAL writer doesn't have nearly as many jobs to do and they don't involve nearly as much I/O, so the chances of a long delay due to the process being busy are much less.

Hmm, so setting synchronous_standby_names to '' takes effect immediately, but other changes to it don't apply to already-blocked commits. That seems a bit inconsistent. Perhaps walwriter should store the parsed list of standby names in shared memory, not just a boolean.

+1 otherwise.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Thu, Mar 17, 2011 at 2:08 AM, Fujii Masao masao.fu...@gmail.com wrote:

> This occurs to me: shouldn't we ensure that, in the shutdown case, walwriter exits only after all the backends have gone? I'm not sure if it's worth thinking about the case, but what if synchronous_standby_names is unset and the config file is reloaded after a smart shutdown has been requested? In this case, the reload cannot wake up the waiting backends, since walwriter has already exited. This behavior looks a bit inconsistent.

I agree we need to fix smart shutdown. I think that's going to have to be a separate patch, however; it's got more problems than this patch can fix without expanding into a monster.

> In the patch, in order to read the latest value, you take a light-weight lock. But I wonder why taking a lock can ensure that the value is up-to-date.

A lock acquisition acts as a memory sequence point.

> + * WAL writer calls this as needed to update the shared sync_standbys_needed
>
> Typo: s/sync_standbys_needed/sync_standbys_defined

Fixed.

> + * we exit normally, or SYNC_REP_MUST_DISCONNECT in abnormal cases.
>
> Typo: the reference to SYNC_REP_MUST_DISCONNECT is not required.

Fixed.

> + * So in this case we issue a NOTICE (which some clients may
>
> Typo: s/NOTICE/WARNING

Fixed.

> + if (ProcDiePending)
> + {
> +     ereport(WARNING,
> +             (errcode(ERRCODE_ADMIN_SHUTDOWN),
> +              errmsg("canceling the wait for replication and terminating connection due to administrator command"),
> +              errdetail("The transaction has already been committed locally but might have not been replicated to the standby.")));
> +     whereToSendOutput = DestNone;
> +     LWLockAcquire(SyncRepLock, LW_EXCLUSIVE);
> +     MyProc->syncRepState = SYNC_REP_WAIT_COMPLETE;
> +     SHMQueueDelete(&(MyProc->syncRepLinks));
>
> SHMQueueIsDetached() should be checked before calling SHMQueueDelete(). Walsender can already delete the backend from the queue before reaching here.

Fixed. But come to think of it, doesn't this mean SyncRepCleanupAtProcExit() needs to repeat the test after acquiring the lock?

> + if (QueryCancelPending)
> + {
> +     QueryCancelPending = false;
> +     ereport(WARNING,
> +             (errmsg("canceling wait for synchronous replication due to user request"),
> +              errdetail("The transaction has committed locally, but may not have replicated to the standby.")));
> +     LWLockAcquire(SyncRepLock, LW_EXCLUSIVE);
> +     MyProc->syncRepState = SYNC_REP_WAIT_COMPLETE;
> +     SHMQueueDelete(&(MyProc->syncRepLinks));
> +     LWLockRelease(SyncRepLock);
>
> Same as above.

Fixed.

> + if (!PostmasterIsAlive(true))
> + {
> +     whereToSendOutput = DestNone;
> +     proc_exit(1);
>
> proc_exit() should not be called at that point because it leads to a PANIC.

Fixed, although I'm not sure it matters.

> I think that it's better to check ProcDiePending, QueryCancelPending and PostmasterIsAlive *before* waiting on the latch, not after. Because those events can occur before reaching there, and it's not worth waiting for 60 seconds to detect them.

Not necessary. Inspired by one of your earlier patches, I made die() and StatementCancelHandler() set the latch. We could still wait up to 60 s to detect postmaster death, but that's a very rare situation and it's not worth slowing down the common case for it, especially since there is a race condition no matter what.

Thanks for the review!

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

sync-rep-wait-fixes-v2.patch
Description: Binary data
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Thu, Mar 17, 2011 at 8:24 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:

> Hmm, so setting synchronous_standby_names to '' takes effect immediately, but other changes to it don't apply to already-blocked commits. That seems a bit inconsistent. Perhaps walwriter should store the parsed list of standby names in shared memory, not just a boolean.

I don't think this is necessary. In general, the current or potential WAL sender processes are responsible for working out among themselves whose job it is to release waiters, and doing it. As long as synchronous_standby_names is non-empty, then either (1) there are one or more standbys connected that can take on the role of synchronous standby, and whoever does will release waiters; or (2) there are no standbys connected that can take on the role of synchronous standby, in which case no waiters should be released until one connects. But when synchronous_standby_names becomes completely empty, that doesn't mean "wait until a standby connects whose application name is in the empty set and make him the synchronous standby", but rather "synchronous replication is administratively disabled; don't wait in the first place". So we just need a Boolean flag.

> +1 otherwise.

Thanks.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Wed, 2011-03-16 at 13:35 -0400, Robert Haas wrote:

> 2. If a query cancel interrupt is received (pg_cancel_backend or ^C), then cancel the sync rep wait and issue a warning before acknowledging the commit.

When I saw this commit, I noticed that the WARNING doesn't have an errcode(). It seems like it should -- this is the kind of thing that the client is likely to care about, and may want to handle specially.

Regards, Jeff Davis
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Tue, 2011-03-15 at 22:07 -0400, Robert Haas wrote:

> On Wed, Mar 9, 2011 at 11:11 PM, Fujii Masao masao.fu...@gmail.com wrote:
>> Same as above. I think that it's more problematic to leave the code as it is. Because smart/fast shutdown can make the server get stuck until immediate shutdown is requested.
>
> I agree that the current state of affairs is a problem. However, after looking through the code somewhat carefully, it looks a bit difficult to fix. Suppose that backend A is waiting for sync rep. A fast shutdown is performed. Right now, backend A shrugs its shoulders and does nothing. Not good. But suppose we change it so that backend A closes the connection and exits without either confirming the commit or throwing ERROR/FATAL. That seems like correct behavior, since, if we weren't using sync rep, the client would have to interpret that as indicating that the connection died in mid-COMMIT, and mustn't assume anything about the state of the transaction. So far so good.
>
> The problem is that there may be another backend B waiting on a lock held by A. If backend A exits cleanly (without a PANIC), it will remove itself from the ProcArray and release locks. That wakes up B, which can now go do its thing. If the operating system is a bit on the slow side delivering the signal to B, then the client to which B is connected might manage to see a database state that shows the transaction previously running in A as committed, even though that transaction wasn't committed. That would stink, because the whole point of having A hold onto locks until the standby ack'd the commit was that no other transaction would see it as committed until it was replicated. This is a pretty unlikely race condition in practice, but people who are running sync rep are intending precisely to guard against unlikely failure scenarios.
>
> The only idea I have for allowing fast shutdown to still be fast, even when sync rep is involved, is to shut down the system in two phases. The postmaster would need to stop accepting new connections, and first kill off all the backends that aren't waiting for sync rep. Then, once all remaining backends are waiting for sync rep, we can have them proceed as above: close the connection without acking the commit or throwing ERROR/FATAL, and exit. That's pretty complicated, especially given the rule that the postmaster mustn't touch shared memory, but I don't see any alternative. We could just not allow fast shutdown, as now, but I think that's worse.

Please explain why not allowing fast shutdown makes it worse?

For me, I'd rather not support a whole bunch of dubious code, just to allow you to type "-m fast" when you can already type "-m immediate". What extra capability are we actually delivering by doing that? The risk of introducing a bug and thereby losing data far outweighs the rather dubious benefit.

-- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Wed, Mar 16, 2011 at 1:43 AM, Fujii Masao masao.fu...@gmail.com wrote:

> On Wed, Mar 16, 2011 at 11:07 AM, Robert Haas robertmh...@gmail.com wrote:
>> The problem is that there may be another backend B waiting on a lock held by A. If backend A exits cleanly (without a PANIC), it will remove itself from the ProcArray and release locks. That wakes up B, which can now go do its thing. If the operating system is a bit on the slow side delivering the signal to B, then the client to which B is connected might manage to see a database state that shows the transaction previously running in A as committed, even though that transaction wasn't committed. That would stink, because the whole point of having A hold onto locks until the standby ack'd the commit was that no other transaction would see it as committed until it was replicated.
>
> The lock can also be released when the transaction running in A is rolled back. So I could not understand why the client would always wrongly see the transaction as committed even though it's not committed.

The transaction IS committed, but only locally.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Wed, Mar 16, 2011 at 4:51 AM, Simon Riggs si...@2ndquadrant.com wrote:

> On Tue, 2011-03-15 at 22:07 -0400, Robert Haas wrote:
>> On Wed, Mar 9, 2011 at 11:11 PM, Fujii Masao masao.fu...@gmail.com wrote:
>>> Same as above. I think that it's more problematic to leave the code as it is. Because smart/fast shutdown can make the server get stuck until immediate shutdown is requested.
>>
>> I agree that the current state of affairs is a problem. However, after looking through the code somewhat carefully, it looks a bit difficult to fix. Suppose that backend A is waiting for sync rep. A fast shutdown is performed. Right now, backend A shrugs its shoulders and does nothing. Not good. But suppose we change it so that backend A closes the connection and exits without either confirming the commit or throwing ERROR/FATAL. That seems like correct behavior, since, if we weren't using sync rep, the client would have to interpret that as indicating that the connection died in mid-COMMIT, and mustn't assume anything about the state of the transaction. So far so good.
>>
>> The problem is that there may be another backend B waiting on a lock held by A. If backend A exits cleanly (without a PANIC), it will remove itself from the ProcArray and release locks. That wakes up B, which can now go do its thing. If the operating system is a bit on the slow side delivering the signal to B, then the client to which B is connected might manage to see a database state that shows the transaction previously running in A as committed, even though that transaction wasn't committed. That would stink, because the whole point of having A hold onto locks until the standby ack'd the commit was that no other transaction would see it as committed until it was replicated. This is a pretty unlikely race condition in practice, but people who are running sync rep are intending precisely to guard against unlikely failure scenarios.
>>
>> The only idea I have for allowing fast shutdown to still be fast, even when sync rep is involved, is to shut down the system in two phases. The postmaster would need to stop accepting new connections, and first kill off all the backends that aren't waiting for sync rep. Then, once all remaining backends are waiting for sync rep, we can have them proceed as above: close the connection without acking the commit or throwing ERROR/FATAL, and exit. That's pretty complicated, especially given the rule that the postmaster mustn't touch shared memory, but I don't see any alternative. We could just not allow fast shutdown, as now, but I think that's worse.
>
> Please explain why not allowing fast shutdown makes it worse?
>
> For me, I'd rather not support a whole bunch of dubious code, just to allow you to type "-m fast" when you can already type "-m immediate". What extra capability are we actually delivering by doing that? The risk of introducing a bug and thereby losing data far outweighs the rather dubious benefit.

Well, my belief is that when users ask the database to shut down, they want it to work. If I'm the only one who thinks that, then whatever. But I firmly believe we'll get bug reports about this.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Wed, Mar 16, 2011 at 7:39 AM, Robert Haas robertmh...@gmail.com wrote:

>>> The only idea I have for allowing fast shutdown to still be fast, even when sync rep is involved, is to shut down the system in two phases. The postmaster would need to stop accepting new connections, and first kill off all the backends that aren't waiting for sync rep. Then, once all remaining backends are waiting for sync rep, we can have them proceed as above: close the connection without acking the commit or throwing ERROR/FATAL, and exit. That's pretty complicated, especially given the rule that the postmaster mustn't touch shared memory, but I don't see any alternative.
>>
>> What extra capability are we actually delivering by doing that? The risk of introducing a bug and thereby losing data far outweighs the rather dubious benefit.
>
> Well, my belief is that when users ask the database to shut down, they want it to work. If I'm the only one who thinks that, then whatever. But I firmly believe we'll get bug reports about this.

On further review, the approach proposed above doesn't really work, because a backend can get a SIGTERM either because the system is doing a fast shutdown or because a user has issued pg_terminate_backend(PID); and in the latter case we have to continue letting in connections. As of right now, synchronous replication continues to wait even when:

- someone tries to perform a fast shutdown
- someone tries to kill the backend using pg_terminate_backend()
- someone attempts to cancel the query using pg_cancel_backend() or by pressing control-C in, for example, psql
- someone attempts to shut off synchronous replication by changing synchronous_standby_names in postgresql.conf and issuing pg_ctl reload

We've worked pretty hard to ensure that things like query cancel and shutdown work quickly and reliably, and I don't think we want to make synchronous replication the one part of the system that departs from that general principle. So, patch attached. This patch arranges to do the following things:

1. If a die interrupt is received (pg_terminate_backend or fast shutdown), then terminate the sync rep wait and arrange for the connection to be closed without acknowledging the commit (but do send a warning message back). The commit still happened, though, so other transactions will see its effects. This is unavoidable unless we're willing to either ignore attempts to terminate a backend waiting for sync rep, or panic the system when it happens, and I don't think either of those is appropriate.

2. If a query cancel interrupt is received (pg_cancel_backend or ^C), then cancel the sync rep wait and issue a warning before acknowledging the commit. Again, the alternative is to either ignore the cancel or panic, neither of which I believe to be what users will want.

3. If synchronous_standby_names is changed to '' by editing postgresql.conf and issuing pg_ctl reload, then cancel all waits in progress and wake everybody up. As I mentioned before, reloading the config file from within the waiting backend (which can't safely throw an error) seems risky, so what I did instead is made WAL writer responsible for handling this. Nobody's allowed to wait for sync rep unless a global shared memory flag is set, and the WAL writer process is responsible for setting and clearing this flag when the config file is reloaded. This has basically no performance cost; WAL writer only ever does any extra work at all with this code when it receives a SIGHUP, and even then the work is trivial except in the case where synchronous_standby_names has changed from empty to non-empty or vice versa. The advantage of putting this in WAL writer rather than, say, bgwriter is that WAL writer doesn't have nearly as many jobs to do and they don't involve nearly as much I/O, so the chances of a long delay due to the process being busy are much less.

4. Remove the SYNC_REP_MUST_DISCONNECT state, which actually does absolutely nothing right now, despite what the name would seem to imply. In particular, it doesn't arrange for any sort of disconnect. This patch does arrange for that, but not using this mechanism.

5. The existing code relies on being able to read MyProc->syncRepState without holding the lock, even while a WAL sender must be updating it in another process. I'm not 100% sure this is safe on a multi-processor machine with weak memory ordering. In practice, the chances of something going wrong here seem extremely small. You'd need something like this: a WAL sender updates MyProc->syncRepState just after the wait timeout expires and before the latch is reset, but the regular backend fails to see the state change due to memory-ordering effects and drops through the loop, waiting another 60 s, and then finally wakes up and completes the wait (but a minute later than expected). That seems vanishingly unlikely but it's also simple to protect against, so I did.

Review appreciated.

Thanks,

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
Robert Haas robertmh...@gmail.com writes:

> 1. If a die interrupt is received (pg_terminate_backend or fast shutdown), then terminate the sync rep wait and arrange for the connection to be closed without acknowledging the commit (but do send a warning message back). The commit still happened, though, so other transactions will see its effects. This is unavoidable unless we're willing to either ignore attempts to terminate a backend waiting for sync rep, or panic the system when it happens, and I don't think either of those is appropriate.

Is it possible to force the standby out here, so that logs show that there was something going on wrt replication?

> 2. If a query cancel interrupt is received (pg_cancel_backend or ^C), then cancel the sync rep wait and issue a warning before acknowledging the commit. Again, the alternative is to either ignore the cancel or panic, neither of which I believe to be what users will want.

Or force the standby to disconnect. In both those cases what we have is a situation where either we can't satisfy the user request or we can't continue to offer sync rep. You're saying that we have to satisfy the user's query, so I say kick off sync rep or it does not make any sense.

> 3. If synchronous_standby_names is changed to '' by editing postgresql.conf and issuing pg_ctl reload, then cancel all waits in progress and wake everybody up. As I mentioned before, reloading the

Ok.

Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Wed, Mar 16, 2011 at 6:23 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:

> Robert Haas robertmh...@gmail.com writes:
>> 1. If a die interrupt is received (pg_terminate_backend or fast shutdown), then terminate the sync rep wait and arrange for the connection to be closed without acknowledging the commit (but do send a warning message back). The commit still happened, though, so other transactions will see its effects. This is unavoidable unless we're willing to either ignore attempts to terminate a backend waiting for sync rep, or panic the system when it happens, and I don't think either of those is appropriate.
>
> Is it possible to force the standby out here, so that logs show that there was something going on wrt replication?

That's an interesting idea, but I think it might be too much spooky action at a distance. I think we should look at getting Fujii Masao's replication_timeout patch committed; that seems like the right way to kick out unresponsive standbys. Another problem with doing it here is that any ERROR will turn into a PANIC, which rules out doing anything very complicated. Also note that we can (and do) log a WARNING, which I think answers your concern about having something in the logs wrt replication. A further point is that even if we could kick out the standby, it'd presumably reconnect after the usual 2 s interval, so it doesn't seem like it really accomplishes much. We can't just unilaterally decide that it is no longer allowed to be a sync standby ever again; that's controlled by postgresql.conf.

I think the most important part of all this is that it is logged. Anyone who is running synchronous replication should also be doing careful monitoring; if not, shame on them, because if your data is important enough that you need synchronous replication, it's surely important enough to watch the logs. If you don't, all sorts of bad things can happen to your data (either related to sync rep, or otherwise) and you'll have no idea until it's far too late.

>> 2. If a query cancel interrupt is received (pg_cancel_backend or ^C), then cancel the sync rep wait and issue a warning before acknowledging the commit. Again, the alternative is to either ignore the cancel or panic, neither of which I believe to be what users will want.
>
> Or force the standby to disconnect. In both those cases what we have is a situation where either we can't satisfy the user request or we can't continue to offer sync rep. You're saying that we have to satisfy the user's query, so I say kick off sync rep or it does not make any sense.

Same considerations here.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Wed, Mar 16, 2011 at 8:30 PM, Robert Haas robertmh...@gmail.com wrote:

> I think the most important part of all this is that it is logged. Anyone who is running synchronous replication should also be doing careful monitoring; if not, shame on them, because if your data is important enough that you need synchronous replication, it's surely important enough to watch the logs. If you don't, all sorts of bad things can happen to your data (either related to sync rep, or otherwise) and you'll have no idea until it's far too late.

+ If your data is that important, your logs/monitoring are *equally* important, because they are what give you confidence your data is as safe as you think it is...

-- Aidan Van Dyk ai...@highrise.ca http://www.highrise.ca/
Create like a god, command like a king, work like a slave.
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Wed, Mar 9, 2011 at 11:11 PM, Fujii Masao masao.fu...@gmail.com wrote:

> Same as above. I think that it's more problematic to leave the code as it is. Because smart/fast shutdown can make the server get stuck until immediate shutdown is requested.

I agree that the current state of affairs is a problem. However, after looking through the code somewhat carefully, it looks a bit difficult to fix. Suppose that backend A is waiting for sync rep. A fast shutdown is performed. Right now, backend A shrugs its shoulders and does nothing. Not good. But suppose we change it so that backend A closes the connection and exits without either confirming the commit or throwing ERROR/FATAL. That seems like correct behavior, since, if we weren't using sync rep, the client would have to interpret that as indicating that the connection died in mid-COMMIT, and mustn't assume anything about the state of the transaction. So far so good.

The problem is that there may be another backend B waiting on a lock held by A. If backend A exits cleanly (without a PANIC), it will remove itself from the ProcArray and release locks. That wakes up B, which can now go do its thing. If the operating system is a bit on the slow side delivering the signal to B, then the client to which B is connected might manage to see a database state that shows the transaction previously running in A as committed, even though that transaction wasn't committed. That would stink, because the whole point of having A hold onto locks until the standby ack'd the commit was that no other transaction would see it as committed until it was replicated. This is a pretty unlikely race condition in practice, but people who are running sync rep are intending precisely to guard against unlikely failure scenarios.

The only idea I have for allowing fast shutdown to still be fast, even when sync rep is involved, is to shut down the system in two phases. The postmaster would need to stop accepting new connections, and first kill off all the backends that aren't waiting for sync rep. Then, once all remaining backends are waiting for sync rep, we can have them proceed as above: close the connection without acking the commit or throwing ERROR/FATAL, and exit. That's pretty complicated, especially given the rule that the postmaster mustn't touch shared memory, but I don't see any alternative. We could just not allow fast shutdown, as now, but I think that's worse.

Thoughts?

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Wed, Mar 16, 2011 at 11:07 AM, Robert Haas robertmh...@gmail.com wrote:

> The problem is that there may be another backend B waiting on a lock held by A. If backend A exits cleanly (without a PANIC), it will remove itself from the ProcArray and release locks. That wakes up B, which can now go do its thing. If the operating system is a bit on the slow side delivering the signal to B, then the client to which B is connected might manage to see a database state that shows the transaction previously running in A as committed, even though that transaction wasn't committed. That would stink, because the whole point of having A hold onto locks until the standby ack'd the commit was that no other transaction would see it as committed until it was replicated.

The lock can also be released when the transaction running in A is rolled back. So I could not understand why the client would always wrongly see the transaction as committed even though it's not committed.

Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] Sync Rep v19
On Fri, Mar 11, 2011 at 5:50 AM, Robert Haas robertmh...@gmail.com wrote: On Thu, Mar 10, 2011 at 3:29 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Robert Haas robertmh...@gmail.com writes: they are, but there's no easy way to figure out what that means in terms of wall-clock time, which I think would be useful. Jan Wieck had a detailed proposal to make that happen at the last developer meeting, but then ran out of time to implement it for 9.1 it seems. The idea was basically to have a ticker in core, an SRF that would associate txid_snapshot with wall clock time. Lots of good things would come from that. http://archives.postgresql.org/pgsql-hackers/2010-05/msg01209.php Of course if you think that's important enough for you to implement it between now and beta, that would be great :) I think that's actually something a little different, and more complicated, but I do think it'd be useful. I was hoping there was a simple way to get some kind of time-based information into pg_stat_replication, but if there isn't, there isn't. How about sending the timestamp of the last applied transaction (i.e., the return value of pg_last_xact_replay_timestamp) from the standby to the master, and reporting it in pg_stat_replication? Then you can see the lag by comparing it with current_timestamp. But since the last replay timestamp doesn't advance (while the current timestamp does) if there is no work on the master, the calculated lag might be unexpectedly large. So, to calculate the exact lag, I'm thinking that we should introduce a new function which returns the timestamp of the last transaction written on the master. Thought? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
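The pitfall Fujii points out (an idle master makes "now minus last replay timestamp" balloon even though the standby is fully caught up) can be sketched like this; the function names are illustrative only, not PostgreSQL APIs:

```python
# Two ways to compute standby lag. The naive one overstates lag on an
# idle master; comparing last-write vs. last-replay timestamps does not.
from datetime import datetime

def naive_lag(now, last_replay_ts):
    # now - last replay timestamp: grows even when there is no new work.
    return now - last_replay_ts

def write_replay_lag(last_write_ts, last_replay_ts):
    # last write timestamp on the master vs. last replay on the standby.
    return last_write_ts - last_replay_ts

now = datetime(2011, 3, 11, 12, 0, 0)
last_commit = datetime(2011, 3, 11, 11, 0, 0)   # master idle for an hour
# The standby has replayed everything, so replay ts == write ts.
print(naive_lag(now, last_commit))                 # 1:00:00 -- misleading
print(write_replay_lag(last_commit, last_commit))  # 0:00:00 -- true lag
```

This is exactly why the thread converges on exposing the master's last write timestamp: when the master is idle, write and replay timestamps coincide and the computed lag is zero.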
Re: [HACKERS] Sync Rep v19
On Fri, Mar 11, 2011 at 7:08 AM, Fujii Masao masao.fu...@gmail.com wrote: On Fri, Mar 11, 2011 at 5:50 AM, Robert Haas robertmh...@gmail.com wrote: On Thu, Mar 10, 2011 at 3:29 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Robert Haas robertmh...@gmail.com writes: they are, but there's no easy way to figure out what that means in terms of wall-clock time, which I think would be useful. Jan Wieck had a detailed proposal to make that happen at the last developer meeting, but then ran out of time to implement it for 9.1 it seems. The idea was basically to have a ticker in core, an SRF that would associate txid_snapshot with wall clock time. Lots of good things would come from that. http://archives.postgresql.org/pgsql-hackers/2010-05/msg01209.php Of course if you think that's important enough for you to implement it between now and beta, that would be great :) I think that's actually something a little different, and more complicated, but I do think it'd be useful. I was hoping there was a simple way to get some kind of time-based information into pg_stat_replication, but if there isn't, there isn't. How about sending the timestamp of last applied transaction (i.e., this is the return value of pg_last_xact_replay_timestamp) from the standby to the master, and reporting it in pg_stat_replication? Then you can see the lag by comparing it with current_timestamp. But since the last replay timestamp doesn't advance (but current timestamp advances) if there is no work on the master, the calculated lag might be unexpectedly too large. So, to calculate the exact lag, I'm thinking that we should introduce new function which returns the timestamp of the last transaction written in the master. Thought? Hmm... where would we get that value from? And what if no transactions are running on the master? 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Sync Rep v19
On Fri, Mar 11, 2011 at 10:02 PM, Robert Haas robertmh...@gmail.com wrote: How about sending the timestamp of last applied transaction (i.e., this is the return value of pg_last_xact_replay_timestamp) from the standby to the master, and reporting it in pg_stat_replication? Then you can see the lag by comparing it with current_timestamp. But since the last replay timestamp doesn't advance (but current timestamp advances) if there is no work on the master, the calculated lag might be unexpectedly too large. So, to calculate the exact lag, I'm thinking that we should introduce new function which returns the timestamp of the last transaction written in the master. Thought? Hmm... where would we get that value from? xl_xact_commit->xact_time (which is set in RecordTransactionCommit) and xl_xact_abort->xact_time (which is set in RecordTransactionAbort). And what if no transactions are running on the master? In that case, the last write WAL timestamp would become equal to the last replay WAL timestamp. So we can see that there is no lag. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] Sync Rep v19
On Fri, Mar 11, 2011 at 8:21 AM, Fujii Masao masao.fu...@gmail.com wrote: On Fri, Mar 11, 2011 at 10:02 PM, Robert Haas robertmh...@gmail.com wrote: How about sending the timestamp of last applied transaction (i.e., this is the return value of pg_last_xact_replay_timestamp) from the standby to the master, and reporting it in pg_stat_replication? Then you can see the lag by comparing it with current_timestamp. But since the last replay timestamp doesn't advance (but current timestamp advances) if there is no work on the master, the calculated lag might be unexpectedly too large. So, to calculate the exact lag, I'm thinking that we should introduce new function which returns the timestamp of the last transaction written in the master. Thought? Hmm... where would we get that value from? xl_xact_commit->xact_time (which is set in RecordTransactionCommit) and xl_xact_abort->xact_time (which is set in RecordTransactionAbort). And what if no transactions are running on the master? In that case, the last write WAL timestamp would become equal to the last replay WAL timestamp. So we can see that there is no lag. Oh, I see (I think). You're talking about write/replay lag, but I was thinking of master/slave transmission lag. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Sync Rep v19
On Fri, Mar 11, 2011 at 09:03:33AM -0500, Robert Haas wrote: On Fri, Mar 11, 2011 at 8:21 AM, Fujii Masao masao.fu...@gmail.com wrote: In that case, the last write WAL timestamp would become equal to the last replay WAL timestamp. So we can see that there is no lag. Oh, I see (I think). You're talking about write/replay lag, but I was thinking of master/slave transmission lag. Which are both useful numbers to know: the first tells you how stale queries from a Hot Standby will be, the second tells you the maximum data loss from a "meteor hits the master" scenario where that slave is promoted, if I understand all the interactions correctly. Ross -- Ross Reedstrom, Ph.D. reeds...@rice.edu Systems Engineer Admin, Research Scientist phone: 713-348-6166 Connexions http://cnx.org fax: 713-348-3665 Rice University MS-375, Houston, TX 77005 GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E F888 D3AE 810E 88F0 BEDE
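The two numbers Ross distinguishes could be computed separately, roughly as below. This is a hedged Python sketch with hypothetical names, not the actual pg_stat_replication columns:

```python
from datetime import datetime

def staleness(now, last_replay_ts):
    # How old the data a Hot Standby query may see (write/replay lag).
    return now - last_replay_ts

def transmission_exposure(master_write_lsn, standby_flush_lsn):
    # WAL written on the master but not yet safely on the standby:
    # the upper bound on loss if the master is destroyed right now.
    return master_write_lsn - standby_flush_lsn

now = datetime(2011, 3, 11, 12, 0, 30)
print(staleness(now, datetime(2011, 3, 11, 12, 0, 0)))  # 0:00:30
print(transmission_exposure(0x5000, 0x4C00))            # 1024
```

Staleness is a wall-clock quantity, while transmission exposure is naturally measured in bytes of WAL, which is why the thread keeps circling around how to translate one into the other.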
Re: [HACKERS] Sync Rep v19
On Wed, Mar 9, 2011 at 9:21 PM, Bruce Momjian br...@momjian.us wrote: Simon Riggs wrote: On Fri, 2011-03-04 at 23:15 +0900, Fujii Masao wrote: postgres=# SELECT application_name, state, sync_priority, sync_state FROM pg_stat_replication; application_name | state | sync_priority | sync_state --+---+---+ one | STREAMING | 1 | POTENTIAL two | STREAMING | 2 | SYNC (2 rows) Bug! Thanks. Is there a reason these status are all upper-case? Not that I know of. However, I think that some more fundamental rethinking of the state mechanism may be in order. When Magnus first committed this, it would say CATCHUP whenever you were behind (even if only momentarily) and STREAMING if you were caught up. Simon then changed it so that it says CATCHUP until you catch up the first time, and then STREAMING afterward (even if you fall behind again). Neither behavior seems completely adequate to me. I think we should have a way to know whether we've ever been caught up, and if so when the most recent time was. So you could then say things like "is the most recent time at which the standby was caught up within the last 30 seconds?", which would be a useful thing to monitor, and right now there's no way to do it. There's also a BACKUP state, but I'm not sure it makes sense to lump that in with the others. Some day it might be possible to stream WAL and take a backup at the same time, over the same connection. Maybe that should be a separate column or something. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
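The monitoring question Robert raises (was the standby caught up within the last N seconds?) amounts to tracking one extra timestamp. A minimal sketch, assuming hypothetical LSN inputs rather than real pg_stat_replication fields:

```python
# Hypothetical monitor: remember the most recent time the standby was
# caught up, and answer "was that within some window?".
from datetime import datetime, timedelta

class CatchupMonitor:
    def __init__(self):
        self.last_caught_up = None

    def observe(self, now, sent_lsn, replayed_lsn):
        # Standby counts as caught up when it has replayed everything sent.
        if replayed_lsn >= sent_lsn:
            self.last_caught_up = now

    def caught_up_within(self, now, window):
        return (self.last_caught_up is not None
                and now - self.last_caught_up <= window)

m = CatchupMonitor()
t0 = datetime(2011, 3, 10, 12, 0, 0)
m.observe(t0, sent_lsn=100, replayed_lsn=100)    # caught up at t0
m.observe(t0 + timedelta(seconds=40), 200, 150)  # fell behind again
print(m.caught_up_within(t0 + timedelta(seconds=40),
                         timedelta(seconds=30)))  # False
```

Unlike the binary CATCHUP/STREAMING flag in either of the two committed behaviors, the remembered timestamp answers the "within the last 30 seconds" question directly.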
Re: [HACKERS] Sync Rep v19
On Wed, 2011-03-09 at 21:21 -0500, Bruce Momjian wrote: Simon Riggs wrote: On Fri, 2011-03-04 at 23:15 +0900, Fujii Masao wrote: postgres=# SELECT application_name, state, sync_priority, sync_state FROM pg_stat_replication; application_name | state | sync_priority | sync_state --+---+---+ one | STREAMING | 1 | POTENTIAL two | STREAMING | 2 | SYNC (2 rows) Bug! Thanks. Is there a reason these status are all upper-case? Not as far as I know. I'll add it to the list of changes for beta. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Sync Rep v19
Robert Haas robertmh...@gmail.com writes: was. So you could then say things like is the most recent time at which the standby was caught up within the last 30 seconds?, which would be a useful thing to monitor, and right now there's no way to do Well in my experience with replication, that's not what I want to monitor. If the standby is synchronous, then it's not catching up, it's streaming. If it were not, it would not be a synchronous standby. When a standby is asynchronous then what I want to monitor is its lag. So the CATCHUP state is useful to see that a synchronous standby candidate can not yet be a synchronous standby. When it just lost its synchronous status (and hopefully another standby is now the sync one), then it's just asynchronous and I want to know its lag. it. There's also a BACKUP state, but I'm not sure it makes sense to lump that in with the others. Some day it might be possible to stream WAL and take a backup at the same time, over the same connection. Maybe that should be a separate column or something. BACKUP is still meaningful if you stream WAL at the same time, because you're certainly *not* applying them while doing the base backup, are you? So you're not yet a standby, that's what BACKUP means. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Sync Rep v19
On Thu, Mar 10, 2011 at 2:42 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Robert Haas robertmh...@gmail.com writes: was. So you could then say things like is the most recent time at which the standby was caught up within the last 30 seconds?, which would be a useful thing to monitor, and right now there's no way to do Well in my experience with replication, that's not what I want to monitor. If the standby is synchronous, then it's not catching up, it's streaming. If it were not, it would not be a synchronous standby. When a standby is asynchronous then what I want to monitor is its lag. So the CATCHUP state is useful to see that a synchronous standby candidate can not yet be a synchronous standby. When it just lost its synchronous status (and hopefully another standby is now the sync one), then it's just asynchronous and I want to know its lag. Yeah, maybe. The trick is how to measure the lag. I proposed the above scheme mostly as a way of giving the user some piece of information that can be measured in seconds rather than WAL position, but I'm open to better ideas. Monitoring is pretty hard to do at all in 9.0; in 9.1, we'll be able to tell them how many *bytes* behind they are, but there's no easy way to figure out what that means in terms of wall-clock time, which I think would be useful. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Sync Rep v19
Robert Haas robertmh...@gmail.com writes: they are, but there's no easy way to figure out what that means in terms of wall-clock time, which I think would be useful. Jan Wieck had a detailed proposal to make that happen at the last developer meeting, but then ran out of time to implement it for 9.1 it seems. The idea was basically to have a ticker in core, an SRF that would associate txid_snapshot with wall clock time. Lots of good things would come from that. http://archives.postgresql.org/pgsql-hackers/2010-05/msg01209.php Of course if you think that's important enough for you to implement it between now and beta, that would be great :) Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Sync Rep v19
On Thu, Mar 10, 2011 at 3:29 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Robert Haas robertmh...@gmail.com writes: they are, but there's no easy way to figure out what that means in terms of wall-clock time, which I think would be useful. Jan Wieck had a detailed proposal to make that happen at the last developer meeting, but then ran out of time to implement it for 9.1 it seems. The idea was basically to have a ticker in core, an SRF that would associate txid_snapshot with wall clock time. Lots of good things would come from that. http://archives.postgresql.org/pgsql-hackers/2010-05/msg01209.php Of course if you think that's important enough for you to implement it between now and beta, that would be great :) I think that's actually something a little different, and more complicated, but I do think it'd be useful. I was hoping there was a simple way to get some kind of time-based information into pg_stat_replication, but if there isn't, there isn't. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On 2011-03-09 08:38, Fujii Masao wrote: On Wed, Mar 9, 2011 at 2:14 PM, Jaime Casanova ja...@2ndquadrant.com wrote: On Tue, Mar 8, 2011 at 11:58 AM, Robert Haas robertmh...@gmail.com wrote: The fast shutdown handling seems fine, but why not just handle smart shutdown the same way? currently, smart shutdown means no new connections, wait until existing ones close normally. for consistency, it should behave the same for sync rep. Agreed. I think that user who wants to request smart shutdown expects all the existing connections to basically be closed normally by the client. So it doesn't seem to be good idea to forcibly close the connection and prevent the COMMIT from being returned in smart shutdown case. But I'm all ears for better suggestions. For me smart has always been synonymous with no forced disconnects/exits, or put differently, the 'clean' solution, as opposed to the fast and unclean shutdown. An alternative for a clean solution might be to forbid smart shutdown if none of the sync standbys is connected. This would prevent the master from entering a state in which a standby cannot connect anymore. regards, Yeb Havinga
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Wed, Mar 9, 2011 at 08:38, Fujii Masao masao.fu...@gmail.com wrote: On Wed, Mar 9, 2011 at 2:14 PM, Jaime Casanova ja...@2ndquadrant.com wrote: On Tue, Mar 8, 2011 at 11:58 AM, Robert Haas robertmh...@gmail.com wrote: The fast shutdown handling seems fine, but why not just handle smart shutdown the same way? currently, smart shutdown means no new connections, wait until existing ones close normally. for consistency, it should behave the same for sync rep. Agreed. I think that user who wants to request smart shutdown expects all the existing connections to basically be closed normally by the client. So it doesn't seem to be good idea to forcibly close the connection and prevent the COMMIT from being returned in smart shutdown case. But I'm all ears for better suggestions. don't use smart shutdowns? ;) Anyway, for those that *do* use smart intentionally, I agree that doing any kind of forced close at all is just plain wrong. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Wed, 2011-03-09 at 16:38 +0900, Fujii Masao wrote: On Wed, Mar 9, 2011 at 2:14 PM, Jaime Casanova ja...@2ndquadrant.com wrote: On Tue, Mar 8, 2011 at 11:58 AM, Robert Haas robertmh...@gmail.com wrote: The fast shutdown handling seems fine, but why not just handle smart shutdown the same way? currently, smart shutdown means no new connections, wait until existing ones close normally. for consistency, it should behave the same for sync rep. Agreed. I think that user who wants to request smart shutdown expects all the existing connections to basically be closed normally by the client. So it doesn't seem to be good idea to forcibly close the connection and prevent the COMMIT from being returned in smart shutdown case. But I'm all ears for better suggestions. Anyway, we got the consensus about how fast shutdown should work with sync rep. So I created the patch. Please feel free to comment and commit the patch first ;) We're just about to publish Alpha4 with this feature in. If we release waiters too early we will cause effective data loss, that part is agreed. We've also accepted that there are a few ways to release the waiters. I want to release the first version as safe and then relax from there after feedback. What I don't want to hear is lots of complaints or arguments about data loss from the first version. We can relax later, after some time and thought. So for now, I don't want to apply this patch or other similar ones that seek to release waiters in various circumstances. That isn't a rejection of you, it's just a wish to play it safe and slowly. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On 2011-03-09 15:10, Simon Riggs wrote: On Wed, 2011-03-09 at 16:38 +0900, Fujii Masao wrote: On Wed, Mar 9, 2011 at 2:14 PM, Jaime Casanova ja...@2ndquadrant.com wrote: On Tue, Mar 8, 2011 at 11:58 AM, Robert Haas robertmh...@gmail.com wrote: The fast shutdown handling seems fine, but why not just handle smart shutdown the same way? currently, smart shutdown means no new connections, wait until existing ones close normally. for consistency, it should behave the same for sync rep. Agreed. I think that user who wants to request smart shutdown expects all the existing connections to basically be closed normally by the client. So it doesn't seem to be good idea to forcibly close the connection and prevent the COMMIT from being returned in smart shutdown case. But I'm all ears for better suggestions. Anyway, we got the consensus about how fast shutdown should work with sync rep. So I created the patch. Please feel free to comment and commit the patch first ;) We're just about to publish Alpha4 with this feature in. If we release waiters too early we will cause effective data loss, that part is agreed. We've also accepted that there are few ways to release the waiters. I want to release the first version as safe and then relax from there after feedback. This is not safe and possible in the first version: 1) issue stop on master when no sync standby is connected: mgrid@mg73:~$ pg_ctl -D /data stop waiting for server to shut down... failed pg_ctl: server does not shut down 2) start the standby that failed mgrid@mg72:~$ pg_ctl -D /data start pg_ctl: another server might be running; trying to start server anyway LOG: 0: database system was interrupted while in recovery at log time 2011-03-09 15:22:31 CET HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target. 
LOG: 0: entering standby mode LOG: 0: redo starts at 57/1A78 LOG: 0: consistent recovery state reached at 57/1AA0 FATAL: XX000: could not connect to the primary server: FATAL: the database system is shutting down LOCATION: libpqrcv_connect, libpqwalreceiver.c:102 server starting mgrid@mg72:~$ FATAL: XX000: could not connect to the primary server: FATAL: the database system is shutting down A safe solution would be to prevent smart shutdown on the master if it is in sync mode and there are no sync standbys connected. The current situation is definitely unsafe because it forces people that are in this state to do a fast shutdown, but that fails as well, so they are only left with immediate. mgrid@mg73:~$ pg_ctl -D /data stop -m fast waiting for server to shut down... failed pg_ctl: server does not shut down mgrid@mg73:~$ pg_ctl -D /data stop -m immediate waiting for server to shut down done server stopped regards, Yeb Havinga
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Wed, 2011-03-09 at 15:37 +0100, Yeb Havinga wrote: The current situation is definately unsafe because it forces people that are in this state to do a fast shutdown.. but that fails as well, so they are only left with immediate. All the more reason not to change anything, since we disagree. The idea is that you're supposed to wait for the standby to come back up or do failover. If you shut down the master it's because you are choosing to fail over. Shutting down the master and restarting without failover leads to a situation where some sync rep commits are not on both master and standby. Making it easier to shut down encourages that, which I don't wish to do, at this stage. We may decide that this is the right approach but I don't wish to rush into that decision. I want to have clear agreement about all the changes we want and what we call them if we do them. Zero data loss is ultimately about users having confidence in us, not about specific features. Our disagreements on this patch risk damaging that confidence, whoever is right/wrong. Further changes can be made over the course of the next few weeks, based upon feedback from a wider pool of potential users. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Sync Rep v19
Simon Riggs wrote: On Fri, 2011-03-04 at 23:15 +0900, Fujii Masao wrote: postgres=# SELECT application_name, state, sync_priority, sync_state FROM pg_stat_replication; application_name | state | sync_priority | sync_state --+---+---+ one | STREAMING | 1 | POTENTIAL two | STREAMING | 2 | SYNC (2 rows) Bug! Thanks. Is there a reason these status are all upper-case? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Thu, Mar 10, 2011 at 12:03 AM, Simon Riggs si...@2ndquadrant.com wrote: On Wed, 2011-03-09 at 15:37 +0100, Yeb Havinga wrote: The current situation is definately unsafe because it forces people that are in this state to do a fast shutdown.. but that fails as well, so they are only left with immediate. I agree with Yeb. All the more reason not to change anything, since we disagree. The idea is that you're supposed to wait for the standby to come back up or do failover. If you shutdown the master its because you are choosing to failover. Shutting down the master and restarting without failover leads to a situation where some sync rep commits are not on both master and standby. Making it easier to shutdown encourages that, which I don't wish to do, at this stage. I'm not sure I follow you. The proposed fast shutdown prevents the backends which have not received the ACK from the standby yet from returning the success to the client. So even after restarting the server, there is no data loss from the client's point of view. If this is really unsafe, we *must* forbid immediate shutdown while a backend is waiting for sync rep. Because immediate shutdown creates the same situation. What scenario are you concerned about? We may decide that this is the right approach but I don't wish to rush into that decision. I want to have clear agreement about all the changes we want and what we call them if we do them. Zero data loss is ultimately about users having confidence in us, not about specific features. Our disagreements on this patch risk damaging that confidence, whoever is right/wrong. Same as above. I think that it's more problematic to leave the code as it is. Because smart/fast shutdown can make the server get stuck until immediate shutdown is requested. 
Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] Sync Rep v19
On Mon, Mar 7, 2011 at 4:54 AM, Robert Haas robertmh...@gmail.com wrote: On Mar 6, 2011, at 9:44 AM, Fujii Masao masao.fu...@gmail.com wrote: On Sun, Mar 6, 2011 at 5:02 PM, Yeb Havinga yebhavi...@gmail.com wrote: On Sun, Mar 6, 2011 at 8:58 AM, Fujii Masao masao.fu...@gmail.com wrote: If unfortunately all connection slots are used by backends waiting for replication, we cannot execute such a function. So it makes more sense to introduce something like pg_ctl standalone command? If it is only for shutdown, maybe pg_ctl stop -m standalone? It's for not only shutdown but also running the primary in standalone mode. So something like pg_ctl standalone is better. For now I think that pg_ctl command is better than built-in function because sometimes we might want to wake waiters up even during shutdown in order to cause shutdown to end. During shutdown, the server doesn't accept any new connection (even from the standby). So, without something like pg_ctl standalone, there is no way to cause shutdown to end. This sounds like an awful hack to work around a bad design. Surely once shutdown reaches a point where new replication connections can no longer be accepted, any standbys hung on commit need to close the connection without responding to the COMMIT, per previous discussion. It's completely unreasonable for sync rep to break the shutdown sequence. Yeah, let's think about how shutdown should work. I'd like to propose the following. Thought? * Smart shutdown Smart shutdown should wait for all the waiting backends to be acked, and should not cause them to forcibly exit. But this leads shutdown to get stuck infinitely if there is no walsender at that time. To enable them to be acked even in that situation, we need to change postmaster so that it accepts the replication connection even during smart shutdown (until we reach PM_SHUTDOWN_2 state). Postmaster has already accepted the superuser connection to cancel backup during smart shutdown. 
So I don't think that the idea to accept the replication connection during smart shutdown is so ugly. * Fast shutdown I agree with you about fast shutdown. Fast shutdown should cause all the backends including waiting ones to exit immediately. At that time, the non-acked backend should not return the success, according to the definition of sync rep. So we need to change a backend so that it gets rid of itself from the waiting queue and exits before returning the success, when it receives SIGTERM. This change leads the waiting backends to do the same even when pg_terminate_backend is called. But since they've not been acked yet, it seems to be reasonable to prevent them from returning the COMMIT. Comments? I'll create the patch barring objection. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
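The backend-side rule Fujii proposes for fast shutdown can be sketched as follows. This is a simplified Python model with illustrative names, not the actual PostgreSQL symbols: on SIGTERM while waiting for the standby's ack, the backend leaves the wait queue and exits without reporting the commit as successful.

```python
# Sketch of a sync-rep wait loop that honors SIGTERM by leaving the
# wait queue and exiting without acking the COMMIT to the client.
def wait_for_sync_rep(wait_queue, backend_id, got_ack, got_sigterm):
    wait_queue.append(backend_id)
    while True:
        if got_ack():
            wait_queue.remove(backend_id)
            return "COMMIT"                 # safe: the standby acked
        if got_sigterm():
            wait_queue.remove(backend_id)   # leave the queue first
            return None                     # exit without acking COMMIT

wq = []
print(wait_for_sync_rep(wq, 42, got_ack=lambda: False,
                        got_sigterm=lambda: True))   # None
print(wq)                                            # []
```

Returning without an ack (rather than a success) is what keeps the client from ever observing a commit the standby never received, which is the invariant the whole shutdown discussion is protecting.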
Re: [HACKERS] Sync Rep v19
On Tue, Mar 8, 2011 at 7:05 AM, Fujii Masao masao.fu...@gmail.com wrote: Yeah, let's think about how shutdown should work. I'd like to propose the following. Thought? * Smart shutdown Smart shutdown should wait for all the waiting backends to be acked, and should not cause them to forcibly exit. But this leads shutdown to get stuck infinitely if there is no walsender at that time. To enable them to be acked even in that situation, we need to change postmaster so that it accepts the replication connection even during smart shutdown (until we reach PM_SHUTDOWN_2 state). Postmaster has already accepted the superuser connection to cancel backup during smart shutdown. So I don't think that the idea to accept the replication connection during smart shutdown is so ugly. * Fast shutdown I agree with you about fast shutdown. Fast shutdown should cause all the backends including waiting ones to exit immediately. At that time, the non-acked backend should not return the success, according to the definition of sync rep. So we need to change a backend so that it gets rid of itself from the waiting queue and exits before returning the success, when it receives SIGTERM. This change leads the waiting backends to do the same even when pg_terminate_backend is called. But since they've not been acked yet, it seems to be reasonable to prevent them from returning the COMMIT. The fast shutdown handling seems fine, but why not just handle smart shutdown the same way? I don't really like the idea of allowing replication connections for longer, and the idea that we don't want to keep waiting for a commit ACK once we're past the point where it's possible for one to occur seems to apply generically to any shutdown sequence. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Sync Rep v19
On Tue, Mar 8, 2011 at 11:58 AM, Robert Haas robertmh...@gmail.com wrote:
> The fast shutdown handling seems fine, but why not just handle smart
> shutdown the same way?

Currently, smart shutdown means no new connections; wait until existing
ones close normally. For consistency, it should behave the same for sync
rep. +1 for Fujii's proposal.

-- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: PostgreSQL support and training
Sync Rep and shutdown Re: [HACKERS] Sync Rep v19
On Wed, Mar 9, 2011 at 2:14 PM, Jaime Casanova ja...@2ndquadrant.com wrote:
> On Tue, Mar 8, 2011 at 11:58 AM, Robert Haas robertmh...@gmail.com wrote:
>> The fast shutdown handling seems fine, but why not just handle smart
>> shutdown the same way?
>
> Currently, smart shutdown means no new connections; wait until existing
> ones close normally. For consistency, it should behave the same for sync
> rep.

Agreed. I think that a user who requests smart shutdown expects all the
existing connections to be closed normally by the client. So it doesn't
seem like a good idea to forcibly close the connection and prevent the
COMMIT from being returned in the smart shutdown case. But I'm all ears
for better suggestions.

Anyway, we got consensus on how fast shutdown should work with sync rep,
so I created the patch. Please feel free to comment on and commit the
patch first ;)

Regards,

-- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center

fast_shutdown_syncrep_v1.patch
Description: Binary data
Re: [HACKERS] Sync rep doc corrections
On 7 March 2011 15:27, Thom Brown t...@linux.com wrote:
> I've attached a small patch with a bit of clarification and a typo fix
> in the synchronous_standby_names parameter info.

Okay, I've noticed that the main documentation also needed some fixes, so
those have been included in this new patch.

-- Thom Brown Twitter: @darkixion IRC (freenode): dark_ixion Registered Linux user: #516935

sync_rep_doc_fix.v2.patch
Description: Binary data
Re: [HACKERS] Sync Rep v19
On 2011-03-07 01:37, Simon Riggs wrote:
> On Sat, 2011-03-05 at 21:11 +0100, Yeb Havinga wrote:
>> I also got a first 1000 tps score
>
> The committed version should be even faster. Would appreciate a retest.

pgbench 5-minute test: pgbench -c 10 -M prepared -T 300 test
dbsize was -s 50, 1Gbit Ethernet

1 async standby          tps = 2475.285931 (excluding connections establishing)
2 async standbys         tps = 2333.670561 (excluding connections establishing)
1 sync standby           tps = 1277.082753 (excluding connections establishing)
1 sync, 1 async standby  tps = 1273.317386 (excluding connections establishing)

Hard for me not to resort to superlatives right now! :-)

regards, Yeb Havinga
Re: [HACKERS] Sync Rep v19
On Mon, 2011-03-07 at 14:20 +0100, Yeb Havinga wrote:
> pgbench 5-minute test: pgbench -c 10 -M prepared -T 300 test
> dbsize was -s 50, 1Gbit Ethernet
>
> 1 async standby          tps = 2475.285931
> 2 async standbys         tps = 2333.670561
> 1 sync standby           tps = 1277.082753
> 1 sync, 1 async standby  tps = 1273.317386
>
> Hard for me not to resort to superlatives right now! :-)

That looks like good news, thanks. It shows that sync rep is fairly fast,
but it also shows clearly why you'd want to mix sync and async replication
within an application.

-- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
Re: How should the primary behave when the sync standby goes away? Re: [HACKERS] Sync Rep v17
On Sun, Mar 6, 2011 at 5:36 PM, Simon Riggs si...@2ndquadrant.com wrote:
> On Fri, 2011-03-04 at 16:57 +0900, Fujii Masao wrote:
>> On Wed, Mar 2, 2011 at 11:30 PM, Fujii Masao masao.fu...@gmail.com wrote:
>>> On Wed, Mar 2, 2011 at 8:22 PM, Simon Riggs si...@2ndquadrant.com wrote:
>>>> The WALSender deliberately does *not* wake waiting users if the
>>>> standby disconnects. Doing so would break the whole reason for having
>>>> sync rep in the first place. What we do is allow a potential standby
>>>> to take over the role of sync standby, if one is available. Or the
>>>> failing standby can reconnect and then release waiters.
>>>
>>> If there is a potential standby when the synchronous standby has gone,
>>> I agree that it's not a good idea to release the waiting backends right
>>> away. In this case, those backends should wait for the next synchronous
>>> standby. On the other hand, if there is no potential standby, I think
>>> that the waiting backends should not wait for the timeout and should
>>> wake up as soon as the synchronous standby has gone. Otherwise, those
>>> backends are suspended for a long time (i.e., until the timeout
>>> expires), which would decrease high availability, I'm afraid.
>>
>> Keeping those backends waiting for the failed standby to reconnect is an
>> idea. But this looks like the behavior for allow_standalone_primary =
>> off. If allow_standalone_primary = on, it looks more natural to make the
>> primary work alone without waiting for the timeout.
>>
>> Also I think that the waiting backends should be released as soon as the
>> last synchronous standby switches to asynchronous mode. Since there is
>> no standby which is planning to reconnect, obviously they no longer need
>> to wait.
>
> I've not done this, but we could. It can't run in a WALSender, so this
> code would need to live in either WALWriter or BgWriter.

I would have thought that the last WALSender to switch to async would have
been responsible for doing this at that time. Why doesn't that work?

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: How should the primary behave when the sync standby goes away? Re: [HACKERS] Sync Rep v17
On Mon, 2011-03-07 at 13:15 -0500, Robert Haas wrote:
> On Sun, Mar 6, 2011 at 5:36 PM, Simon Riggs si...@2ndquadrant.com wrote:
>> On Fri, 2011-03-04 at 16:57 +0900, Fujii Masao wrote:
>>> Also I think that the waiting backends should be released as soon as
>>> the last synchronous standby switches to asynchronous mode. Since there
>>> is no standby which is planning to reconnect, obviously they no longer
>>> need to wait.
>>
>> I've not done this, but we could. It can't run in a WALSender, so this
>> code would need to live in either WALWriter or BgWriter.
>
> I would have thought that the last WALSender to switch to async would
> have been responsible for doing this at that time. Why doesn't that work?

The main time we get extended waits is when there are no WALsenders.

-- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Sync Rep v19
On Sun, 2011-03-06 at 14:27 +0900, Fujii Masao wrote:
> On Sun, Mar 6, 2011 at 2:59 AM, Robert Haas robertmh...@gmail.com wrote:
>> On Sat, Mar 5, 2011 at 11:56 AM, Simon Riggs si...@2ndquadrant.com wrote:
>>>> Even though postmaster dies, the waiting backend keeps waiting until
>>>> the timeout expires. Instead, shouldn't the backends periodically
>>>> check whether postmaster is alive, and exit immediately if it's not,
>>>> as other processes do? If the timeout is disabled, such backends
>>>> would get stuck indefinitely.
>>>
>>> Will wake them every 60 seconds
>>
>> I don't really see why sync rep should be responsible for solving this
>> problem, which is an issue in many other situations as well, only for
>> itself. In fact I think I'd prefer that it didn't, and that we wait for
>> a more general solution that will actually fix this problem for real.
>
> I agree, if such a general solution will be committed together with sync
> rep. Otherwise, because of sync rep, the backend can easily get stuck
> *indefinitely*. When postmaster is not alive, all the existing
> walsenders exit immediately and no new walsender can appear. So there is
> no way to release the waiting backend. I think that some solution for
> this issue, which is likely to happen, is required.

Completely agree.

-- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Sync Rep v19
On Sun, 2011-03-06 at 16:58 +0900, Fujii Masao wrote:
> On Sun, Mar 6, 2011 at 4:51 PM, Fujii Masao masao.fu...@gmail.com wrote:
>> One comment; what about introducing a built-in function to wake up all
>> the waiting backends? When the replication connection is closed, if we
>> STONITH the standby, we can safely (with respect to logical, though not
>> physical, data loss) switch the primary to standalone mode. But there is
>> no way to wake up the waiting backends for now. Setting
>> synchronous_replication to OFF and reloading the configuration file
>> doesn't affect the existing waiting backends. The attached patch
>> introduces the pg_wakeup_all_waiters (better name?) function which
>> wakes up all the backends on the queue.
>
> If, unfortunately, all connection slots are used by backends waiting for
> replication, we cannot execute such a function. So it makes more sense
> to introduce something like a pg_ctl standalone command?

Well, there is one way to end the wait: shutdown, or use
pg_terminate_backend(). If you simply end the wait you will get COMMIT
messages.

What I would like to do is commit the safe patch now. We can then discuss
whether it is safe and desirable to relax some aspects of that during
beta.

-- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Sync Rep v19
On Sun, 2011-03-06 at 01:58 +0900, Fujii Masao wrote:
> On Sun, Mar 6, 2011 at 12:42 AM, Fujii Masao masao.fu...@gmail.com wrote:
>> New comments;
>
> Another one;
>
> +       long    timeout = SyncRepGetWaitTimeout();
> <snip>
> +       else if (timeout > 0 &&
> +                TimestampDifferenceExceeds(wait_start, now, timeout))
> +       {
>
> The third argument of TimestampDifferenceExceeds is expressed in
> milliseconds, but you wrongly give it microseconds.

Just for the record: that code section is now removed in v21.

-- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Sync Rep v19
On Sun, Mar 6, 2011 at 5:26 PM, Simon Riggs si...@2ndquadrant.com wrote:
> On Sun, 2011-03-06 at 16:58 +0900, Fujii Masao wrote:
>> If, unfortunately, all connection slots are used by backends waiting
>> for replication, we cannot execute such a function. So it makes more
>> sense to introduce something like a pg_ctl standalone command?
>
> Well, there is one way to end the wait: shutdown, or use
> pg_terminate_backend().

Immediate shutdown can release the wait, but smart and fast shutdown
cannot. Neither can pg_terminate_backend(). Since a backend is waiting on
the latch and InterruptHoldoffCount is not zero, only SetLatch() or
SIGQUIT can cause the wait to end.

> If you simply end the wait you will get COMMIT messages. What I would
> like to do is commit the safe patch now. We can then discuss whether it
> is safe and desirable to relax some aspects of that during beta.

OK, if changing some aspects is acceptable during beta.

Regards,

-- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] Sync Rep v19
On Sun, Mar 6, 2011 at 5:02 PM, Yeb Havinga yebhavi...@gmail.com wrote:
> On Sun, Mar 6, 2011 at 8:58 AM, Fujii Masao masao.fu...@gmail.com wrote:
>> If, unfortunately, all connection slots are used by backends waiting
>> for replication, we cannot execute such a function. So it makes more
>> sense to introduce something like a pg_ctl standalone command?
>
> If it is only for shutdown, maybe pg_ctl stop -m standalone?

It's not only for shutdown but also for running the primary in standalone
mode, so something like pg_ctl standalone is better. For now I think that
a pg_ctl command is better than a built-in function because sometimes we
might want to wake waiters up even during shutdown, in order to let
shutdown finish. During shutdown, the server doesn't accept any new
connections (even from the standby). So, without something like pg_ctl
standalone, there is no way to make shutdown finish.

Regards,

-- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] Sync Rep v19
On 06/03/2011 03:26, Simon Riggs si...@2ndquadrant.com wrote:
> On Sun, 2011-03-06 at 16:58 +0900, Fujii Masao wrote:
>> If, unfortunately, all connection slots are used by backends waiting
>> for replication, we cannot execute such a function. So it makes more
>> sense to introduce something like a pg_ctl standalone command?
>
> Well, there is one way to end the wait: shutdown, or use
> pg_terminate_backend().

I disconnected all standbys so the master keeps waiting on commit. Then I
shut down the master with immediate mode and got a crash. I wasn't able
to trace that, though.

-- Jaime Casanova www.2ndQuadrant.com
Re: [HACKERS] Sync Rep v19
On Mar 6, 2011, at 9:44 AM, Fujii Masao masao.fu...@gmail.com wrote:
> It's not only for shutdown but also for running the primary in
> standalone mode, so something like pg_ctl standalone is better. [...]
> During shutdown, the server doesn't accept any new connections (even
> from the standby). So, without something like pg_ctl standalone, there
> is no way to make shutdown finish.

This sounds like an awful hack to work around a bad design. Surely once
shutdown reaches a point where new replication connections can no longer
be accepted, any standbys hung on commit need to close the connection
without responding to the COMMIT, per previous discussion. It's completely
unreasonable for sync rep to break the shutdown sequence.

...Robert
Re: [HACKERS] Sync Rep v19
On Sun, 2011-03-06 at 16:51 +0900, Fujii Masao wrote:
> One comment; what about introducing a built-in function to wake up all
> the waiting backends? When the replication connection is closed, if we
> STONITH the standby, we can safely (with respect to logical, though not
> physical, data loss) switch the primary to standalone mode. But there is
> no way to wake up the waiting backends for now. Setting
> synchronous_replication to OFF and reloading the configuration file
> doesn't affect the existing waiting backends. The attached patch
> introduces the pg_wakeup_all_waiters (better name?) function which wakes
> up all the backends on the queue.

Will apply this as a separate commit.

-- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
Re: How should the primary behave when the sync standby goes away? Re: [HACKERS] Sync Rep v17
On Fri, 2011-03-04 at 16:57 +0900, Fujii Masao wrote:
> On Wed, Mar 2, 2011 at 11:30 PM, Fujii Masao masao.fu...@gmail.com wrote:
>> On Wed, Mar 2, 2011 at 8:22 PM, Simon Riggs si...@2ndquadrant.com wrote:
>>> The WALSender deliberately does *not* wake waiting users if the
>>> standby disconnects. Doing so would break the whole reason for having
>>> sync rep in the first place. What we do is allow a potential standby
>>> to take over the role of sync standby, if one is available. Or the
>>> failing standby can reconnect and then release waiters.
>>
>> If there is a potential standby when the synchronous standby has gone,
>> I agree that it's not a good idea to release the waiting backends right
>> away. In this case, those backends should wait for the next synchronous
>> standby. On the other hand, if there is no potential standby, I think
>> that the waiting backends should not wait for the timeout and should
>> wake up as soon as the synchronous standby has gone. Otherwise, those
>> backends are suspended for a long time (i.e., until the timeout
>> expires), which would decrease high availability, I'm afraid.
>
> Keeping those backends waiting for the failed standby to reconnect is an
> idea. But this looks like the behavior for allow_standalone_primary =
> off. If allow_standalone_primary = on, it looks more natural to make the
> primary work alone without waiting for the timeout.
>
> Also I think that the waiting backends should be released as soon as the
> last synchronous standby switches to asynchronous mode. Since there is
> no standby which is planning to reconnect, obviously they no longer need
> to wait.

I've not done this, but we could. It can't run in a WALSender, so this
code would need to live in either WALWriter or BgWriter.

-- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Sync Rep v19
On Sat, 2011-03-05 at 21:11 +0100, Yeb Havinga wrote:
> I also got a first 1000 tps score

The committed version should be even faster. Would appreciate a retest.

-- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Sync Rep v19
On Sat, 2011-03-05 at 16:13 +0900, Fujii Masao wrote:
> On Sat, Mar 5, 2011 at 7:28 AM, Simon Riggs si...@2ndquadrant.com wrote:
>> Almost-working patch attached for the above feature. Time to stop for
>> the day. Patch against current repo version. Current repo version
>> attached here also (v20), which includes all fixes to all known
>> technical issues, major polishing etc.
>
> Thanks for the patch. Now the code for the wait list looks much simpler
> than before! Here are my comments:
>
> @@ -1337,6 +1352,31 @@ pg_stat_get_wal_senders(PG_FUNCTION_ARGS)
> <snip>
> +       if (walsnd->pid != 0 && walsnd->state == WALSNDSTATE_STREAMING)
> +       {
> +           sync_priority[i] = walsnd->sync_standby_priority;
>
> This always reports the priority of a walsender in CATCHUP state as 0. I
> don't think that priority needs to be reported as 0.

Cosmetic change. We can do this, yes.

> When a new standby with the same priority as the current sync standby
> connects, that new standby can become the new sync one even though the
> current one is still running. This happens when the index of the WalSnd
> slot which the new standby uses is ahead of the one the current standby
> uses. People don't expect such an unexpected switchover, I think.

It is documented that the selection of a standby from a set of similar
priorities is indeterminate. If users don't like it, they can change it.

> +       /*
> +        * Assume the queue is ordered by LSN
> +        */
> +       if (XLByteLT(walsndctl->lsn, proc->waitLSN))
> +           return numprocs;
>
> The code to ensure the assumption needs to be added.

Yes, just need to add the code for traversing the list backwards.

-- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Sync Rep v19
On Sat, Mar 5, 2011 at 7:28 AM, Simon Riggs si...@2ndquadrant.com wrote:
> Yes, that can happen. As people will no doubt observe, this seems to be
> an argument for wait-forever. What we actually need is a wait that lasts
> longer than it takes for us to decide to fail over, if the standby is
> actually up and this is some kind of split-brain situation. That way the
> clients are still waiting when failover occurs. WAL is missing, but
> since we didn't acknowledge the client, we are OK to treat that
> situation as if it were an abort.

Does Oracle Data Guard in maximum availability mode behave that way? I'm
sure that you are implementing something like the maximum availability
mode rather than the maximum protection one. So I'd like to know how the
data-loss situation I described can be avoided in maximum availability
mode.

Regards,

-- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] Sync Rep v19
On Sat, 2011-03-05 at 11:04 +0000, Simon Riggs wrote:
>>> +       /*
>>> +        * Assume the queue is ordered by LSN
>>> +        */
>>> +       if (XLByteLT(walsndctl->lsn, proc->waitLSN))
>>> +           return numprocs;
>>
>> The code to ensure the assumption needs to be added.
>
> Yes, just need to add the code for traversing the list backwards.

I've added code to shmqueue.c to allow this. New version pushed.

-- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Sync Rep v19
On Sat, Mar 5, 2011 at 6:04 AM, Simon Riggs si...@2ndquadrant.com wrote:
> It is documented that the selection of a standby from a set of similar
> priorities is indeterminate. If users don't like it, they can change it.

That doesn't seem like a good argument to *change* the synchronous standby
once it's already set.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company