Re: Server won't start with fallback setting by initdb.

2018-03-08 Thread Tom Lane
Robert Haas  writes:
> On Wed, Mar 7, 2018 at 6:43 PM, Michael Paquier  wrote:
>> On Wed, Mar 07, 2018 at 06:39:32PM -0500, Tom Lane wrote:
>>> OK, seems like I'm on the short end of that vote.  I propose to push the
>>> GUC-crosschecking patch I posted yesterday, but not the default-value
>>> change, and instead push Kyotaro-san's initdb change.  Should we back-patch
>>> these things to v10 where the problem appeared?

>> I would vote for a backpatch.  If anybody happens to run initdb on v10
>> and gets max_connections set to 10, that would immediately cause a failure.
>> We could also wait for somebody to actually complain about it, but a
>> bit of prevention would not hurt to improve the future user experience
>> on this released version.

> In theory, back-patching the GUC-crosschecking patch could cause the
> cluster to fail to restart after the upgrade.  It's pretty unlikely.
> We have to postulate someone with, say, default values but for
> max_connections=12.  But it's not impossible.  I would be inclined to
> back-patch the increase in the max_connections fallback from 10 to 20
> because that fixes a real, if unlikely, failure mode, but treat the
> GUC cross-checking stuff as a master-only improvement.  Although it's
> unlikely to hurt many people, there's no real upside.  Nobody is going
> to say "boy, it's a good thing they tidied that GUC cross-checking in
> the latest major release -- that really saved my bacon!".  Nothing is
> really broken as things stand.

Done that way.  I concur that there's little reason to back-patch
the cross-check change before v10, since the case was even less likely
to happen back when max_wal_senders defaulted to zero.  There's some
argument for changing it in v10, but avoiding thrashing translatable
strings in a released branch probably outweighs it.

regards, tom lane



Re: Server won't start with fallback setting by initdb.

2018-03-07 Thread Robert Haas
On Wed, Mar 7, 2018 at 6:43 PM, Michael Paquier  wrote:
> On Wed, Mar 07, 2018 at 06:39:32PM -0500, Tom Lane wrote:
>> OK, seems like I'm on the short end of that vote.  I propose to push the
>> GUC-crosschecking patch I posted yesterday, but not the default-value
>> change, and instead push Kyotaro-san's initdb change.  Should we back-patch
>> these things to v10 where the problem appeared?
>
> I would vote for a backpatch.  If anybody happens to run initdb on v10
> and gets max_connections set to 10, that would immediately cause a failure.
> We could also wait for somebody to actually complain about it, but a
> bit of prevention would not hurt to improve the future user experience
> on this released version.

In theory, back-patching the GUC-crosschecking patch could cause the
cluster to fail to restart after the upgrade.  It's pretty unlikely.
We have to postulate someone with, say, default values but for
max_connections=12.  But it's not impossible.  I would be inclined to
back-patch the increase in the max_connections fallback from 10 to 20
because that fixes a real, if unlikely, failure mode, but treat the
GUC cross-checking stuff as a master-only improvement.  Although it's
unlikely to hurt many people, there's no real upside.  Nobody is going
to say "boy, it's a good thing they tidied that GUC cross-checking in
the latest major release -- that really saved my bacon!".  Nothing is
really broken as things stand.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Server won't start with fallback setting by initdb.

2018-03-07 Thread Michael Paquier
On Wed, Mar 07, 2018 at 06:39:32PM -0500, Tom Lane wrote:
> OK, seems like I'm on the short end of that vote.  I propose to push the
> GUC-crosschecking patch I posted yesterday, but not the default-value
> change, and instead push Kyotaro-san's initdb change.  Should we back-patch
> these things to v10 where the problem appeared?

I would vote for a backpatch.  If anybody happens to run initdb on v10
and gets max_connections set to 10, that would immediately cause a failure.
We could also wait for somebody to actually complain about it, but a
bit of prevention would not hurt to improve the future user experience
on this released version.
--
Michael


signature.asc
Description: PGP signature


Re: Server won't start with fallback setting by initdb.

2018-03-07 Thread Tom Lane
Robert Haas  writes:
> On Tue, Mar 6, 2018 at 10:51 PM, Stephen Frost  wrote:
>> Changing the defaults to go back down strikes me as an entirely wrong
>> approach after we've had a release with the higher defaults without
>> seriously compelling arguments against, and I don't agree that we've had
>> such a case made here.

> +1.  I don't see any real downside of increasing the minimum value of
> max_connections to 20.  I wasn't particularly a fan of raising
> max_wal_senders to 10, but a lot of other people were, and so far
> nobody's reported any problems related to that setting (that I know
> about).

OK, seems like I'm on the short end of that vote.  I propose to push the
GUC-crosschecking patch I posted yesterday, but not the default-value
change, and instead push Kyotaro-san's initdb change.  Should we back-patch
these things to v10 where the problem appeared?

regards, tom lane



Re: Server won't start with fallback setting by initdb.

2018-03-06 Thread Stephen Frost
Greetings Tom, Peter, all,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Peter Eisentraut  writes:
> > On 3/4/18 15:31, Tom Lane wrote:
> >> Then, seeing that the factory defaults are ReservedBackends = 3 and
> >> max_wal_senders = 10, something's got to give; there's no way that
> >> max_connections = 10 can work with those.  But what I would argue is that
> >> of those three choices, the least defensible one is max_wal_senders = 10.
> >> Where did that come from?
> 
> > Let's see.  A typical installation might need:
> 
> > 1 for pg_receivewal for continuous backup
> > 2 for pg_basebackup
> > 2 in case pg_basebackup gets interrupted and it takes 2 hours to free the
> > TCP/IP connections
> > 1 for a standby connection
> > 1 for a second standby connection, for making infrastructure changes
> 
> That's "typical"?  It sounds like a major installation to me, one that
> would certainly have had to fool with more settings than just
> max_wal_senders.  Two concurrent pg_basebackups running at all times
> seems particularly dubious.
> 
> If we drop the assumption of 2 concurrent pg_basebackups, then your
> math would lead to a value of 5, which I'd be OK with.

Changing the defaults to go back down strikes me as an entirely wrong
approach after we've had a release with the higher defaults without
seriously compelling arguments against, and I don't agree that we've had
such a case made here.  If this discussion had happened before v10 was
released, I'd be much more open to going with the suggestion of '5', but
forcing users to update their configs for working environments because
we've decided that the default of 10 was too high is just pedantry, in
my opinion.

The original patch proposed strikes me as entirely reasonable, though
given the rarity of seeing a max_connections below 100 in the wild,
unlikely to have any impact on real users.  On the other hand, changing
this default setting will actively *break* user environments which are
working today, for very questionable benefit, and in a way that is hard
to notice: a user could pg_upgrade and hit no immediate issue until a
weekly cronjob fails or similar.  The current value is in the wild and
we've had no reports of performance issues, as best as I can recall, and
in reviewing the various places where max_wal_senders is used, it seems
unlikely that a value of 10 is going to be seriously worse than a value
of 5.  If such a case does arise, which seems likely to be the exception
rather than the rule, encouraging those users to reduce max_wal_senders
is a straightforward answer.

Thanks!

Stephen




Re: Server won't start with fallback setting by initdb.

2018-03-06 Thread Tom Lane
Peter Eisentraut  writes:
> On 3/4/18 15:31, Tom Lane wrote:
>> Then, seeing that the factory defaults are ReservedBackends = 3 and
>> max_wal_senders = 10, something's got to give; there's no way that
>> max_connections = 10 can work with those.  But what I would argue is that
>> of those three choices, the least defensible one is max_wal_senders = 10.
>> Where did that come from?

> Let's see.  A typical installation might need:

> 1 for pg_receivewal for continuous backup
> 2 for pg_basebackup
> 2 in case pg_basebackup gets interrupted and it takes 2 hours to free the
> TCP/IP connections
> 1 for a standby connection
> 1 for a second standby connection, for making infrastructure changes

That's "typical"?  It sounds like a major installation to me, one that
would certainly have had to fool with more settings than just
max_wal_senders.  Two concurrent pg_basebackups running at all times
seems particularly dubious.

If we drop the assumption of 2 concurrent pg_basebackups, then your
math would lead to a value of 5, which I'd be OK with.

regards, tom lane



Re: Server won't start with fallback setting by initdb.

2018-03-06 Thread Tom Lane
I wrote:
> Therefore, the condition that actually ought to be getting enforced here
> is either "ReservedBackends + max_wal_senders < MaxConnections", or
> "ReservedBackends + max_wal_senders <= MaxConnections", depending on
> whether you think it's appropriate to require at least one not-reserved-
> for-superusers connection slot to remain available if the walsenders
> slots are fully populated.

I propose the first attached patch to do that.  (I failed to resist the
temptation to copy-edit some nearby docs and comments, too.)

> My proposal is to default max_wal_senders to perhaps 3, and leave
> initdb's logic alone.

... and then the second attached patch to do that.

I noticed that a lot of our TAP tests are setting max_wal_senders
and max_replication_slots to random values around 4 or 5.  Probably
we could drop all that now, and let them just use the defaults.
I've not done that here, except for adjusting 010_pg_basebackup.pl
which would fail for no very good reason with minimal max_connections.

regards, tom lane

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 259a2d8..3a8fc7d 100644
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
*** include_dir 'conf.d'
*** 696,703 
  
 
  The default value is three connections. The value must be less
! than the value of max_connections. This
! parameter can only be set at server start.
 

   
--- 696,704 
  
 
  The default value is three connections. The value must be less
! than max_connections minus
! .
! This parameter can only be set at server start.
 

   
*** include_dir 'conf.d'
*** 2982,2994 
  maximum number of simultaneously running WAL sender
  processes). The default is 10. The value 0 means replication is
  disabled. WAL sender processes count towards the total number
! of connections, so the parameter cannot be set higher than
! .  Abrupt streaming client
! disconnection might cause an orphaned connection slot until
  a timeout is reached, so this parameter should be set slightly
  higher than the maximum number of expected clients so disconnected
  clients can immediately reconnect.  This parameter can only
! be set at server start. wal_level must be set to
  replica or higher to allow connections from standby
  servers.
 
--- 2983,2998 
  maximum number of simultaneously running WAL sender
  processes). The default is 10. The value 0 means replication is
  disabled. WAL sender processes count towards the total number
! of connections, so this parameter's value must be less than
!  minus
! .
! Abrupt streaming client disconnection might leave an orphaned
! connection slot behind until
  a timeout is reached, so this parameter should be set slightly
  higher than the maximum number of expected clients so disconnected
  clients can immediately reconnect.  This parameter can only
! be set at server start.
! Also, wal_level must be set to
  replica or higher to allow connections from standby
  servers.
 
*** include_dir 'conf.d'
*** 3007,3016 
   (see ) that the server
   can support. The default is 10.  This parameter can only be set at
   server start.
!  wal_level must be set
!  to replica or higher to allow replication slots to
!  be used. Setting it to a lower value than the number of currently
   existing replication slots will prevent the server from starting.
  
 

--- 3011,3021 
   (see ) that the server
   can support. The default is 10.  This parameter can only be set at
   server start.
!  Setting it to a lower value than the number of currently
   existing replication slots will prevent the server from starting.
+  Also, wal_level must be set
+  to replica or higher to allow replication slots to
+  be used.
  
 

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index f3ddf82..660f318 100644
*** a/src/backend/postmaster/postmaster.c
--- b/src/backend/postmaster/postmaster.c
*** char	   *ListenAddresses;
*** 202,210 
  
  /*
   * ReservedBackends is the number of backends reserved for superuser use.
!  * This number is taken out of the pool size given by MaxBackends so
   * number of backend slots available to non-superusers is
!  * (MaxBackends - ReservedBackends).  Note what this really means is
   * "if there are <= ReservedBackends connections available, only superusers
   * can make new connections" --- pre-existing superuser connections don't
   * count 

Re: Server won't start with fallback setting by initdb.

2018-03-06 Thread Peter Eisentraut
On 3/4/18 15:31, Tom Lane wrote:
> Then, seeing that the factory defaults are ReservedBackends = 3 and
> max_wal_senders = 10, something's got to give; there's no way that
> max_connections = 10 can work with those.  But what I would argue is that
> of those three choices, the least defensible one is max_wal_senders = 10.
> Where did that come from?

Let's see.  A typical installation might need:

1 for pg_receivewal for continuous backup
2 for pg_basebackup
2 in case pg_basebackup gets interrupted and it takes 2 hours to free the
TCP/IP connections
1 for a standby connection
1 for a second standby connection, for making infrastructure changes

The purpose of raising the defaults to 10 was so that most users
wouldn't need to make changes, making it easier to access "proper"
backups and HA.  By my account, if the default were less than 7, the
setting would be insufficient to satisfy that goal.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Server won't start with fallback setting by initdb.

2018-03-04 Thread Tom Lane
Michael Paquier  writes:
> On Sun, Mar 04, 2018 at 03:31:31PM -0500, Tom Lane wrote:
>> ... But what I would argue is that
>> of those three choices, the least defensible one is max_wal_senders = 10.
>> Where did that come from?  What fraction of real-world installations will
>> need that?  We don't choose defaults that overprovision small
>> installations by 5X or 10X anywhere else, so why here?

> Those numbers are coming from f6d6d29, which points to this thread at
> its root:
> https://www.postgresql.org/message-id/CABUevEwfV7zDutescm2PHGvsJdYA0RWHFMTRGhwrJPGgSbzZDQ%40mail.gmail.com

> The value of max_wal_senders came out of a consensus that those slots
> are cheap to enable, but the exact number seems to have emerged by
> itself; I am not seeing any specific reason for it on the thread.

I don't doubt that the amount of shared memory involved is negligible,
but I'm less sure that there's no impact at all from overprovisioning 
max_wal_senders.  What I see in the code is a lot of places iterating
over each walsender slot and taking a spinlock on each slot, whether
or not the slot is used (or ever has been used).  Are we sure that
none of those places are performance hot spots?

AFAICS from a quick spin through the above-mentioned thread, there
was little discussion of the exact value to set max_wal_senders to,
and no performance testing of the point.

regards, tom lane



Re: Server won't start with fallback setting by initdb.

2018-03-04 Thread Michael Paquier
On Sun, Mar 04, 2018 at 03:31:31PM -0500, Tom Lane wrote:
> Then, seeing that the factory defaults are ReservedBackends = 3 and
> max_wal_senders = 10, something's got to give; there's no way that
> max_connections = 10 can work with those.  But what I would argue is that
> of those three choices, the least defensible one is max_wal_senders = 10.
> Where did that come from?  What fraction of real-world installations will
> need that?  We don't choose defaults that overprovision small
> installations by 5X or 10X anywhere else, so why here?

Those numbers are coming from f6d6d29, which points to this thread at
its root:
https://www.postgresql.org/message-id/CABUevEwfV7zDutescm2PHGvsJdYA0RWHFMTRGhwrJPGgSbzZDQ%40mail.gmail.com

The value of max_wal_senders came out of a consensus that those slots
are cheap to enable, but the exact number seems to have emerged by
itself; I am not seeing any specific reason for it on the thread.

> My proposal is to default max_wal_senders to perhaps 3, and leave
> initdb's logic alone.

I agree with you here.  That was actually my first counter proposal on
the matter, which is also conservative:
https://www.postgresql.org/message-id/CAB7nPqSFzsO6bEknEQ8yidwXOOUUeCc05NKsPQFhMWBFPv3Smg%40mail.gmail.com
--
Michael




Re: Server won't start with fallback setting by initdb.

2018-03-04 Thread Tom Lane
Robert Haas  writes:
> On Fri, Feb 9, 2018 at 3:08 AM, Kyotaro HORIGUCHI
>  wrote:
>> I think that we can safely increase the fallback value to 20 with
>> which regtests are known not to fail.

> I propose an alternative fix: let's instead change the code like this:
> if (max_wal_senders > MaxConnections)

I think there is a bigger reason not to like that code.  If you look
a bit wider at the context, we are independently constraining
max_wal_senders and ReservedBackends:

    if (ReservedBackends >= MaxConnections)
    {
        write_stderr("%s: superuser_reserved_connections must be less than max_connections\n",
                     progname);
        ExitPostmaster(1);
    }
    if (max_wal_senders >= MaxConnections)
    {
        write_stderr("%s: max_wal_senders must be less than max_connections\n",
                     progname);
        ExitPostmaster(1);
    }

But this is insufficient to prevent trouble, because elsewhere we learn that

 * The last few connections slots are reserved for superusers. Although
 * replication connections currently require superuser privileges, we
 * don't allow them to consume the reserved slots, which are intended for
 * interactive use.

Therefore, the condition that actually ought to be getting enforced here
is either "ReservedBackends + max_wal_senders < MaxConnections", or
"ReservedBackends + max_wal_senders <= MaxConnections", depending on
whether you think it's appropriate to require at least one not-reserved-
for-superusers connection slot to remain available if the walsenders
slots are fully populated.

Then, seeing that the factory defaults are ReservedBackends = 3 and
max_wal_senders = 10, something's got to give; there's no way that
max_connections = 10 can work with those.  But what I would argue is that
of those three choices, the least defensible one is max_wal_senders = 10.
Where did that come from?  What fraction of real-world installations will
need that?  We don't choose defaults that overprovision small
installations by 5X or 10X anywhere else, so why here?

My proposal is to default max_wal_senders to perhaps 3, and leave
initdb's logic alone.

regards, tom lane



Re: Server won't start with fallback setting by initdb.

2018-02-28 Thread Robert Haas
On Fri, Feb 9, 2018 at 3:08 AM, Kyotaro HORIGUCHI
 wrote:
> I'm not sure such a case happens in the real world nowadays, but
> initdb can fall back to max_connections=10.  The server then refuses
> to start, since that is not larger than max_wal_senders (10):
>
>> postgres: max_wal_senders must be less than max_connections
>
> I think that we can safely increase the fallback value to 20, with
> which the regression tests are known not to fail.  I believe that is
> preferable to explicitly reducing max_wal_senders in the generated
> config file.  I confirmed that regtest won't fail with that value
> (except with a permanent failure of dynamic shared memory).

I propose an alternative fix: let's instead change the code like this:

    if (max_wal_senders > MaxConnections)
    {
        write_stderr("%s: max_wal_senders may not be more than max_connections\n",
                     progname);
        ExitPostmaster(1);
    }

That way, the behavior would match the documentation, which says:

"WAL sender processes count towards the total number of connections,
so the parameter cannot be set higher than max_connections."

Alternatively, we could change the documentation to match the code and
then do this.  But it's not good that the documentation and the code
don't agree on whether equal values are acceptable.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Server won't start with fallback setting by initdb.

2018-02-27 Thread Michael Paquier
On Wed, Feb 14, 2018 at 10:08:06AM +0900, Kyotaro HORIGUCHI wrote:
> Definitely. The another rationale for the value is that regtest
> fails with the numbers less than 20. So it's not 11 but
> 20. Currently regtest should succeed with that number of
> connections as written in parallel_schedule and I've read (in
> Robert's mail, but I haven't confirmed) that tests for parallel
> scans are designed to use up to 20 connections.

I just noticed that this thread is registered in the next CF, so I have
switched this patch to ready for committer.  In the documentation,
guc-max-connections (config.sgml) mentions that the default value of
max_connections is typically 100, but that it could be less, as
determined by initdb.  Do we want to specify as well that initdb uses
100 as the upper bound and 20 as the lower bound in its evaluation?  I
would think this is not worth mentioning in the docs, but since initdb
is directly mentioned..
--
Michael




Re: Server won't start with fallback setting by initdb.

2018-02-13 Thread Kyotaro HORIGUCHI
Hello,

At Mon, 12 Feb 2018 22:28:15 +0900, Michael Paquier  wrote 
in <20180212132815.gb18...@paquier.xyz>
> On Fri, Feb 09, 2018 at 05:08:23PM +0900, Kyotaro HORIGUCHI wrote:
> > > postgres: max_wal_senders must be less than max_connections
> > 
> > I think that we can safely increase the fallback value to 20, with
> > which the regression tests are known not to fail.  I believe that is
> > preferable to explicitly reducing max_wal_senders in the generated
> > config file.  I confirmed that regtest won't fail with that value
> > (except with a permanent failure of dynamic shared memory).
> 
> I would vote for just removing the minimum check at 10 connections as
> you do.  It is not worth the code complication in initdb to decrease
> max_wal_senders if max_connections is set to 10, which does not happen

Definitely.  Another rationale for the value is that regtest fails
with numbers less than 20, so it's 20 rather than 11.  Currently
regtest should succeed with that number of connections, as written in
parallel_schedule, and I've read (in Robert's mail, though I haven't
confirmed it) that the tests for parallel scans are designed to use up
to 20 connections.

> even on definitely-not-decent hardware these days, like a RPI
> (memories from my hamster, RIP, which set max_connections to 100).

I believe that is the minimal box on which anyone would casually try to
run PostgreSQL as is.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Server won't start with fallback setting by initdb.

2018-02-12 Thread Michael Paquier
On Fri, Feb 09, 2018 at 05:08:23PM +0900, Kyotaro HORIGUCHI wrote:
> > postgres: max_wal_senders must be less than max_connections
> 
> I think that we can safely increase the fallback value to 20, with
> which the regression tests are known not to fail.  I believe that is
> preferable to explicitly reducing max_wal_senders in the generated
> config file.  I confirmed that regtest won't fail with that value
> (except with a permanent failure of dynamic shared memory).

I would vote for just removing the minimum check at 10 connections as
you do.  It is not worth the code complication in initdb to decrease
max_wal_senders if max_connections is set to 10, which does not happen
even on definitely-not-decent hardware these days, like a RPI
(memories from my hamster, RIP, which set max_connections to 100).
--
Michael

