Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-12-01 Thread Ashutosh Bapat
>
> I also find others's ideas woth considering -- WAL-logging the stats files, 
> type-specific stats files, etc. -- but I'm afraid those ideas would only be 
> employed in a new major release, not in released versions.  I'm asking for a 
> remedy for a user (and potential users) who use older releases.  And, I don't 
> yet understand why patch would make the situation for existing users, and why 
> stop writing files during immediate/abnormal shutdown requires other efforts.
>
I agree with this. I don't think we are going to change stats file
management in older versions. Hence after every crash recovery they
will be thrown away. In that case, why should stats collector in the
older versions wait for those files to be flushed to the disk? I
think, we should apply patch to shut down stats collector immediately
to the older versions and implement the stats file management solution
to the head. Why isn't that a feasible option?

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-26 Thread Robert Haas
On Thu, Nov 24, 2016 at 12:41 AM, Tsunakawa, Takayuki
 wrote:
> From: pgsql-hackers-ow...@postgresql.org
>> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tom Lane
>> Robert Haas  writes:
>> > I agree.  However, in many cases, the major cost of a fast shutdown is
>> > getting the dirty data already in the operating system buffers down to
>> > disk, not in writing out shared_buffers itself.  The latter is
>> > probably a single-digit number of gigabytes, or maybe double-digit.
>> > The former might be a lot more, and the write of the pgstat file may
>> > back up behind it.  I've seen cases where an 8kB buffered write from
>> > Postgres takes tens of seconds to complete because the OS buffer cache
>> > is already saturated with dirty data, and the stats files could easily
>> > be a lot more than that.
>>
>> I think this is mostly FUD, because we don't fsync the stats files.  Maybe
>> we should, but we don't today.  So even if we have managed to get the system
>> into a state where physical writes are heavily backlogged, that's not a
>> reason to assume that the stats collector will be unable to do its thing
>> promptly.  All it has to do is push a relatively small amount of data into
>> kernel buffers.
>
> I'm sorry for my late reply, yesterday was a national holiday in Japan.
>
> It's not FUD.  I understand you hit the slow stats file write problem during 
> some regression test.  You said it took 57 seconds to write the stats file 
> during the postmaster shutdown.  That caused pg_ctl stop to fail due to its 
> 60 second timeout.  Even the regression test environment suffered from the 
> trouble.

+1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-23 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tom Lane
> Maybe a workable compromise would be to leave the file present, and have
> the stats collector re-write it every (say) five minutes.  Then I'd be okay
> with having an immediate shutdown skip writing the file; you'd be losing
> up to five minutes' worth of activity, but not going completely nuts.  So
> the stats collector's normal activities would include writing the temp file
> on-demand and the permanent file on a timed cycle.
> 
> The other components of the fix (deleting on PITR rewind or stats collector
> crash) would remain the same.

The manual says:

"Also, the collector itself emits a new report at most once per 
PGSTAT_STAT_INTERVAL milliseconds (500 ms unless altered while building the 
server). So the displayed information lags behind actual activity."

Doesn't this mean that the stats collector writes files in pg_stat_tmp/ every 
500ms?  If true, how about just moving those files into appropriate locations 
during recovery, instead of removing the files?

I also find others's ideas woth considering -- WAL-logging the stats files, 
type-specific stats files, etc. -- but I'm afraid those ideas would only be 
employed in a new major release, not in released versions.  I'm asking for a 
remedy for a user (and potential users) who use older releases.  And, I don't 
yet understand why patch would make the situation for existing users, and why 
stop writing files during immediate/abnormal shutdown requires other efforts.

Regards
Takayuki Tsunakawa






-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-23 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tom Lane
> Robert Haas  writes:
> > I agree.  However, in many cases, the major cost of a fast shutdown is
> > getting the dirty data already in the operating system buffers down to
> > disk, not in writing out shared_buffers itself.  The latter is
> > probably a single-digit number of gigabytes, or maybe double-digit.
> > The former might be a lot more, and the write of the pgstat file may
> > back up behind it.  I've seen cases where an 8kB buffered write from
> > Postgres takes tens of seconds to complete because the OS buffer cache
> > is already saturated with dirty data, and the stats files could easily
> > be a lot more than that.
> 
> I think this is mostly FUD, because we don't fsync the stats files.  Maybe
> we should, but we don't today.  So even if we have managed to get the system
> into a state where physical writes are heavily backlogged, that's not a
> reason to assume that the stats collector will be unable to do its thing
> promptly.  All it has to do is push a relatively small amount of data into
> kernel buffers.

I'm sorry for my late reply, yesterday was a national holiday in Japan.

It's not FUD.  I understand you hit the slow stats file write problem during 
some regression test.  You said it took 57 seconds to write the stats file 
during the postmaster shutdown.  That caused pg_ctl stop to fail due to its 60 
second timeout.  Even the regression test environment suffered from the trouble.

Regards
Takayuki Tsunakawa






-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-23 Thread Tom Lane
Tomas Vondra  writes:
> This would also reduce the amount of data that we need to write to WAL, 
> although I'm not sure the amount is actually a problem. I've seen 
> instances with ~500MB stat files, but those were instances with hundreds 
> of databases, each with many thousands of objects.

Hm, was this before we split up the stats file per-database?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-23 Thread Tomas Vondra

On 11/23/2016 12:24 PM, Michael Paquier wrote:

On Wed, Nov 23, 2016 at 6:15 PM, Magnus Hagander  wrote:

There's also the consideration of what to do with stats *on the standby*. If
we WAL log the stats file, then when it replays onthe standby, the stats
there will be overwritten. And stats like number of index vs seq scans on
the standby are still interesting and would be lost.


Perhaps it would make sense to separate the stat files by type then?
The action taken for each file depends on its type.



That seems reasonable. There are two types of stats - usage statistics 
(number of index scans etc.), which we probably don't need on standby, 
and statistics that we use to drive autovacuum.


This would also reduce the amount of data that we need to write to WAL, 
although I'm not sure the amount is actually a problem. I've seen 
instances with ~500MB stat files, but those were instances with hundreds 
of databases, each with many thousands of objects.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-23 Thread Michael Paquier
On Wed, Nov 23, 2016 at 6:15 PM, Magnus Hagander  wrote:
> There's also the consideration of what to do with stats *on the standby*. If
> we WAL log the stats file, then when it replays onthe standby, the stats
> there will be overwritten. And stats like number of index vs seq scans on
> the standby are still interesting and would be lost.

Perhaps it would make sense to separate the stat files by type then?
The action taken for each file depends on its type.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-23 Thread Magnus Hagander
On Tue, Nov 22, 2016 at 10:26 PM, Alvaro Herrera 
wrote:

> Andres Freund wrote:
> > On 2016-11-22 16:15:58 -0500, Tom Lane wrote:
>
> > > Maybe a workable compromise would be to leave the file present, and
> have
> > > the stats collector re-write it every (say) five minutes.  Then I'd be
> > > okay with having an immediate shutdown skip writing the file; you'd be
> > > losing up to five minutes' worth of activity, but not going completely
> > > nuts.  So the stats collector's normal activities would include writing
> > > the temp file on-demand and the permanent file on a timed cycle.
> > >
> > > The other components of the fix (deleting on PITR rewind or stats
> > > collector crash) would remain the same.
>
> +1
>
> > It could even make sense to WAL log the contents of the stats file at
> > checkpoint (or similar) time. Then it'd be sane after crash recovery,
> > including starting from an old base backup.  Loosing the information
> > about what to vacuum / analyze after a failover is quite problematic...
>
> An idea worth considering.  This would also mean the file is valid in a
> standby -- the lack of the file is currently a problem if you promote
> the standby, as I recall.  But the file is perhaps too large to write to
> WAL on every checkpoint.
>

There's also the consideration of what to do with stats *on the standby*.
If we WAL log the stats file, then when it replays onthe standby, the stats
there will be overwritten. And stats like number of index vs seq scans on
the standby are still interesting and would be lost.

That's not to say that I don't agree with the issues of failover vs
autovacuum. But let's try to find a way that doesn't hurt other things to
get there. But that could just be part of how we deal with the replay of it.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-22 Thread Alvaro Herrera
Andres Freund wrote:
> On 2016-11-22 16:15:58 -0500, Tom Lane wrote:

> > Maybe a workable compromise would be to leave the file present, and have
> > the stats collector re-write it every (say) five minutes.  Then I'd be
> > okay with having an immediate shutdown skip writing the file; you'd be
> > losing up to five minutes' worth of activity, but not going completely
> > nuts.  So the stats collector's normal activities would include writing
> > the temp file on-demand and the permanent file on a timed cycle.
> >
> > The other components of the fix (deleting on PITR rewind or stats
> > collector crash) would remain the same.

+1

> It could even make sense to WAL log the contents of the stats file at
> checkpoint (or similar) time. Then it'd be sane after crash recovery,
> including starting from an old base backup.  Loosing the information
> about what to vacuum / analyze after a failover is quite problematic...

An idea worth considering.  This would also mean the file is valid in a
standby -- the lack of the file is currently a problem if you promote
the standby, as I recall.  But the file is perhaps too large to write to
WAL on every checkpoint.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-22 Thread Andres Freund
On 2016-11-22 16:15:58 -0500, Tom Lane wrote:
> Alvaro Herrera  writes:
> > Well, the problem is that the stats data is not on disk while the system
> > is in operation, as far as I recall -- it's only in the collector's
> > local memory.  On shutdown we tell it to write it down to a file, and on
> > startup we tell it to read it from the file and then delete it.  I think
> > the rationale for this is to avoid leaving a file with stale data on
> > disk while the system is running.
>
> Maybe a workable compromise would be to leave the file present, and have
> the stats collector re-write it every (say) five minutes.  Then I'd be
> okay with having an immediate shutdown skip writing the file; you'd be
> losing up to five minutes' worth of activity, but not going completely
> nuts.  So the stats collector's normal activities would include writing
> the temp file on-demand and the permanent file on a timed cycle.
>
> The other components of the fix (deleting on PITR rewind or stats
> collector crash) would remain the same.

It could even make sense to WAL log the contents of the stats file at
checkpoint (or similar) time. Then it'd be sane after crash recovery,
including starting from an old base backup.  Loosing the information
about what to vacuum / analyze after a failover is quite problematic...

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-22 Thread Robert Haas
On Tue, Nov 22, 2016 at 4:15 PM, Tom Lane  wrote:
> Alvaro Herrera  writes:
>> Well, the problem is that the stats data is not on disk while the system
>> is in operation, as far as I recall -- it's only in the collector's
>> local memory.  On shutdown we tell it to write it down to a file, and on
>> startup we tell it to read it from the file and then delete it.  I think
>> the rationale for this is to avoid leaving a file with stale data on
>> disk while the system is running.
>
> Maybe a workable compromise would be to leave the file present, and have
> the stats collector re-write it every (say) five minutes.  Then I'd be
> okay with having an immediate shutdown skip writing the file; you'd be
> losing up to five minutes' worth of activity, but not going completely
> nuts.  So the stats collector's normal activities would include writing
> the temp file on-demand and the permanent file on a timed cycle.
>
> The other components of the fix (deleting on PITR rewind or stats
> collector crash) would remain the same.

Sounds great.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-22 Thread Tom Lane
Alvaro Herrera  writes:
> Well, the problem is that the stats data is not on disk while the system
> is in operation, as far as I recall -- it's only in the collector's
> local memory.  On shutdown we tell it to write it down to a file, and on
> startup we tell it to read it from the file and then delete it.  I think
> the rationale for this is to avoid leaving a file with stale data on
> disk while the system is running.

Maybe a workable compromise would be to leave the file present, and have
the stats collector re-write it every (say) five minutes.  Then I'd be
okay with having an immediate shutdown skip writing the file; you'd be
losing up to five minutes' worth of activity, but not going completely
nuts.  So the stats collector's normal activities would include writing
the temp file on-demand and the permanent file on a timed cycle.

The other components of the fix (deleting on PITR rewind or stats
collector crash) would remain the same.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-22 Thread Robert Haas
On Tue, Nov 22, 2016 at 3:52 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> I agree.  However, in many cases, the major cost of a fast shutdown is
>> getting the dirty data already in the operating system buffers down to
>> disk, not in writing out shared_buffers itself.  The latter is
>> probably a single-digit number of gigabytes, or maybe double-digit.
>> The former might be a lot more, and the write of the pgstat file may
>> back up behind it.  I've seen cases where an 8kB buffered write from
>> Postgres takes tens of seconds to complete because the OS buffer cache
>> is already saturated with dirty data, and the stats files could easily
>> be a lot more than that.
>
> I think this is mostly FUD, because we don't fsync the stats files.  Maybe
> we should, but we don't today.  So even if we have managed to get the
> system into a state where physical writes are heavily backlogged, that's
> not a reason to assume that the stats collector will be unable to do its
> thing promptly.  All it has to do is push a relatively small amount of
> data into kernel buffers.

I don't believe that's automatically fast, if we're bumping up against
dirty_ratio.  However, suppose you're right.  Then what prompted the
original complaint?  The OP said "The problem here is that postmaster
took as long as 15 seconds to terminate after it had detected a
crashed backend."  It clearly WASN'T an indefinite hang as might have
occurred with the malloc-lock problem for which we implemented the
SIGKILL stuff.  So something during shutdown took a long time, but not
forever.  There's no convincing evidence I've seen that it has to have
been this particular thing, but I find it plausible, because normal
backends bail out without doing much of anything, and here we have a
process that is trying to continue doing work after having received
SIGQUIT.  If not this, then what?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-22 Thread Tom Lane
Andres Freund  writes:
> But I'm a bit confused too - does this make any sort of difference?
> Because the startup path for crash recovery is like this:
>   pgstat_reset_all();
> so it seems quite inconsequential whether we write out pgstat, because
> we're going to nuke it either way after an immediate shutdown?

The discussion is exactly about whether we shouldn't get rid of that,
rather than doing what's proposed in this patch.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-22 Thread Andres Freund
Hi,

On 2016-11-22 15:49:27 -0500, Robert Haas wrote:
> I think you are almost right.  When the server is running, there are
> files in pg_stat_tmp but not pg_stat; when it is shut down, there are
> files in pg_stat but not pg_stat_tmp.  Of course the data can never be
> ONLY in the collector's backend-local memory because then nobody else
> could read it.

> I don't actually really understand the reason for this distinction.

pg_stat_tmp commonly is placed on tmpfs/a ramdisk.


But I'm a bit confused too - does this make any sort of difference?
Because the startup path for crash recovery is like this:

void
StartupXLOG(void)
{
...
/* REDO */
if (InRecovery)
{
...
/*
 * Reset pgstat data, because it may be invalid after recovery.
 */
pgstat_reset_all();

so it seems quite inconsequential whether we write out pgstat, because
we're going to nuke it either way after an immediate shutdown?

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-22 Thread Tom Lane
Robert Haas  writes:
> I agree.  However, in many cases, the major cost of a fast shutdown is
> getting the dirty data already in the operating system buffers down to
> disk, not in writing out shared_buffers itself.  The latter is
> probably a single-digit number of gigabytes, or maybe double-digit.
> The former might be a lot more, and the write of the pgstat file may
> back up behind it.  I've seen cases where an 8kB buffered write from
> Postgres takes tens of seconds to complete because the OS buffer cache
> is already saturated with dirty data, and the stats files could easily
> be a lot more than that.

I think this is mostly FUD, because we don't fsync the stats files.  Maybe
we should, but we don't today.  So even if we have managed to get the
system into a state where physical writes are heavily backlogged, that's
not a reason to assume that the stats collector will be unable to do its
thing promptly.  All it has to do is push a relatively small amount of
data into kernel buffers.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-22 Thread Robert Haas
On Tue, Nov 22, 2016 at 3:34 PM, Alvaro Herrera
 wrote:
>> OK, that's possible, but I'm not sure.  I think there are two separate
>> issues here.  One is whether we should nuke the stats file on
>> recovery, and the other is whether we should force a final write of
>> the stats file before agreeing to an immediate shutdown.  It seems to
>> me that the first one affects whether all of the counters go to zero,
>> and the second affects whether we lose a small amount of data from
>> just prior to the shutdown.  Right now, we are doing the first, so the
>> second is a waste.  If we decide to start doing the first, we can
>> independently decide whether to also do the second.
>
> Well, the problem is that the stats data is not on disk while the system
> is in operation, as far as I recall -- it's only in the collector's
> local memory.  On shutdown we tell it to write it down to a file, and on
> startup we tell it to read it from the file and then delete it.  I think
> the rationale for this is to avoid leaving a file with stale data on
> disk while the system is running.

/me tests.

I think you are almost right.  When the server is running, there are
files in pg_stat_tmp but not pg_stat; when it is shut down, there are
files in pg_stat but not pg_stat_tmp.  Of course the data can never be
ONLY in the collector's backend-local memory because then nobody else
could read it.

I don't actually really understand the reason for this distinction.
If it's important not to lose the data, then the current system is the
worst of all possible worlds, and it would be better to have only only
one file and atomically rename() a new one in over top of it from time
to time.  If we did that and also committed the proposed patch, it
would be only slightly worse than if we did only that.  Wouldn't it?

>> > Those writes are slow because of the concurrent activity.  If all
>> > backends just throw their hands in the air, no more writes come from
>> > them, so the OS is going to finish the writes pretty quickly (or at
>> > least empty enough of the caches so that the pgstat data fits); so
>> > neither (1) nor (3) should be terribly serious.  I agree that (2) is a
>> > problem, but it's not a problem for everyone.
>>
>> If the operating system buffer cache doesn't contain much dirty data,
>> then I agree.  But there is a large backlog of dirty data there, then
>> it might be quite slow.
>
> That's true, but if the system isn't crashing, then writing a bunch of
> pages would make room for the pgstat data to be written to the OS, which
> is enough (we request only a write, not a flush, as I recall).  So we
> don't need to wait for a very long period.

I'm not sure what you are saying here.  Of course, if the OS writes
pages to disk then there will be room in the buffer cache for more
dirty pages.  The issue is whether this will unduly delay shutdown.

>> > A fast shutdown is not all that fast -- it needs to write the whole
>> > contents of shared buffers down to disk, which may be enormous.
>> > Millions of times bigger than pgstat data.  So a fast shutdown is
>> > actually very slow in a large machine.  An immediate shutdown, even if
>> > it writes pgstat data, is still going to be much smaller in terms of
>> > what is written.
>>
>> I agree.  However, in many cases, the major cost of a fast shutdown is
>> getting the dirty data already in the operating system buffers down to
>> disk, not in writing out shared_buffers itself.  The latter is
>> probably a single-digit number of gigabytes, or maybe double-digit.
>> The former might be a lot more, and the write of the pgstat file may
>> back up behind it.  I've seen cases where an 8kB buffered write from
>> Postgres takes tens of seconds to complete because the OS buffer cache
>> is already saturated with dirty data, and the stats files could easily
>> be a lot more than that.
>
> In the default config, background flushing is invoked when memory is 10%
> dirty (dirty_background_ratio); foreground flushing is forced when
> memory is 40% dirty (dirty_ratio).  That means the pgstat process can
> dirty 30% additional memory before being forced to perform flushing.

There's absolutely no guarantee that we aren't already hard up against
that 40% threshold - or whatever it is - already.  Backends write data
all the time, and flushes only happen at checkpoint time.  Indeed, I'd
argue that if somebody is performing an immediate shutdown, the most
likely reason is that a fast shutdown is too slow.  And the most
likely reason for that is that the operating system buffer cache is
full of dirty data.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-22 Thread Alvaro Herrera
Robert Haas wrote:
> On Tue, Nov 22, 2016 at 1:37 PM, Alvaro Herrera
>  wrote:
> >> > Yes, I am, and I disagree with you.  The current decision on this point
> >> > was made ages ago, before autovacuum even existed let alone relied on
> >> > the stats for proper functioning.  The tradeoff you're saying you're
> >> > okay with is "we'll shut down a few seconds faster, but you're going
> >> > to have table bloat problems later because autovacuum won't know it
> >> > needs to do anything".  I wonder how many of the complaints we get
> >> > about table bloat are a consequence of people not realizing that
> >> > "pg_ctl stop -m immediate" is going to cost them.
> >>
> >> That would be useful information to have, but I bet the answer is "not
> >> that many".  Most people don't shut down their database very often;
> >> they're looking for continuous uptime.  It looks to me like autovacuum
> >> activity causes the statistics files to get refreshed at least once
> >> per autovacuum_naptime, which defaults to once a minute, so on the
> >> average we're talking about the loss of perhaps 30 seconds worth of
> >> statistics.
> >
> > I think you're misunderstanding how this works.  Losing that file
> > doesn't lose just the final 30 seconds worth of data -- it loses
> > *everything*, and every counter goes back to zero.  So it's not a few
> > parts-per-million, it loses however many millions there were.
> 
> OK, that's possible, but I'm not sure.  I think there are two separate
> issues here.  One is whether we should nuke the stats file on
> recovery, and the other is whether we should force a final write of
> the stats file before agreeing to an immediate shutdown.  It seems to
> me that the first one affects whether all of the counters go to zero,
> and the second affects whether we lose a small amount of data from
> just prior to the shutdown.  Right now, we are doing the first, so the
> second is a waste.  If we decide to start doing the first, we can
> independently decide whether to also do the second.

Well, the problem is that the stats data is not on disk while the system
is in operation, as far as I recall -- it's only in the collector's
local memory.  On shutdown we tell it to write it down to a file, and on
startup we tell it to read it from the file and then delete it.  I think
the rationale for this is to avoid leaving a file with stale data on
disk while the system is running.

> > Those writes are slow because of the concurrent activity.  If all
> > backends just throw their hands in the air, no more writes come from
> > them, so the OS is going to finish the writes pretty quickly (or at
> > least empty enough of the caches so that the pgstat data fits); so
> > neither (1) nor (3) should be terribly serious.  I agree that (2) is a
> > problem, but it's not a problem for everyone.
> 
> If the operating system buffer cache doesn't contain much dirty data,
> then I agree.  But there is a large backlog of dirty data there, then
> it might be quite slow.

That's true, but if the system isn't crashing, then writing a bunch of
pages would make room for the pgstat data to be written to the OS, which
is enough (we request only a write, not a flush, as I recall).  So we
don't need to wait for a very long period.

> > A fast shutdown is not all that fast -- it needs to write the whole
> > contents of shared buffers down to disk, which may be enormous.
> > Millions of times bigger than pgstat data.  So a fast shutdown is
> > actually very slow in a large machine.  An immediate shutdown, even if
> > it writes pgstat data, is still going to be much smaller in terms of
> > what is written.
> 
> I agree.  However, in many cases, the major cost of a fast shutdown is
> getting the dirty data already in the operating system buffers down to
> disk, not in writing out shared_buffers itself.  The latter is
> probably a single-digit number of gigabytes, or maybe double-digit.
> The former might be a lot more, and the write of the pgstat file may
> back up behind it.  I've seen cases where an 8kB buffered write from
> Postgres takes tens of seconds to complete because the OS buffer cache
> is already saturated with dirty data, and the stats files could easily
> be a lot more than that.

In the default config, background flushing is invoked when memory is 10%
dirty (dirty_background_ratio); foreground flushing is forced when
memory is 40% dirty (dirty_ratio).  That means the pgstat process can
dirty 30% additional memory before being forced to perform flushing.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-22 Thread Robert Haas
On Tue, Nov 22, 2016 at 1:37 PM, Alvaro Herrera
 wrote:
>> > Yes, I am, and I disagree with you.  The current decision on this point
>> > was made ages ago, before autovacuum even existed let alone relied on
>> > the stats for proper functioning.  The tradeoff you're saying you're
>> > okay with is "we'll shut down a few seconds faster, but you're going
>> > to have table bloat problems later because autovacuum won't know it
>> > needs to do anything".  I wonder how many of the complaints we get
>> > about table bloat are a consequence of people not realizing that
>> > "pg_ctl stop -m immediate" is going to cost them.
>>
>> That would be useful information to have, but I bet the answer is "not
>> that many".  Most people don't shut down their database very often;
>> they're looking for continuous uptime.  It looks to me like autovacuum
>> activity causes the statistics files to get refreshed at least once
>> per autovacuum_naptime, which defaults to once a minute, so on the
>> average we're talking about the loss of perhaps 30 seconds worth of
>> statistics.
>
> I think you're misunderstanding how this works.  Losing that file
> doesn't lose just the final 30 seconds worth of data -- it loses
> *everything*, and every counter goes back to zero.  So it's not a few
> parts-per-million, it loses however many millions there were.

OK, that's possible, but I'm not sure.  I think there are two separate
issues here.  One is whether we should nuke the stats file on
recovery, and the other is whether we should force a final write of
the stats file before agreeing to an immediate shutdown.  It seems to
me that the first one affects whether all of the counters go to zero,
and the second affects whether we lose a small amount of data from
just prior to the shutdown.  Right now, we are doing the first, so the
second is a waste.  If we decide to start doing the first, we can
independently decide whether to also do the second.

>> I also think that you're wildly overestimating the likelihood that
>> writing the stats file will be fast, because (1) anything that
>> involves writing to the disk can be very slow, either because there's
>> a lot of other write activity or because the disk is going bad, the
>> latter being actually a pretty common cause of emergency database
>> shutdowns, (2) the stats files can be quite large if the database
>> system contains hundreds of thousands or even millions of objects,
>> which is not all that infrequent, and (3) pgstat wait timeouts are
>> pretty common, which would not be the case if writing the file was
>> invariably fast (c.f. 75b48e1fff8a4dedd3ddd7b76f6360b5cc9bb741).
>
> Those writes are slow because of the concurrent activity.  If all
> backends just throw their hands in the air, no more writes come from
> them, so the OS is going to finish the writes pretty quickly (or at
> least empty enough of the caches so that the pgstat data fits); so
> neither (1) nor (3) should be terribly serious.  I agree that (2) is a
> problem, but it's not a problem for everyone.

If the operating system buffer cache doesn't contain much dirty data,
then I agree.  But there is a large backlog of dirty data there, then
it might be quite slow.

>> >> ...  Yeah, it's not good, but neither are the things that prompt
>> >> people to perform an immediate shutdown in the first place.
>> >
>> > Really?  I think many users think an immediate shutdown is just fine.
>>
>> Why would anybody ever perform an immediate shutdown rather than a
>> fast shutdown if a fast shutdown were fast?
>
> A fast shutdown is not all that fast -- it needs to write the whole
> contents of shared buffers down to disk, which may be enormous.
> Millions of times bigger than pgstat data.  So a fast shutdown is
> actually very slow in a large machine.  An immediate shutdown, even if
> it writes pgstat data, is still going to be much smaller in terms of
> what is written.

I agree.  However, in many cases, the major cost of a fast shutdown is
getting the dirty data already in the operating system buffers down to
disk, not in writing out shared_buffers itself.  The latter is
probably a single-digit number of gigabytes, or maybe double-digit.
The former might be a lot more, and the write of the pgstat file may
back up behind it.  I've seen cases where an 8kB buffered write from
Postgres takes tens of seconds to complete because the OS buffer cache
is already saturated with dirty data, and the stats files could easily
be a lot more than that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-22 Thread Alvaro Herrera
Robert Haas wrote:
> On Tue, Nov 22, 2016 at 12:54 PM, Tom Lane  wrote:
> > Robert Haas  writes:
> >> But that's not what is at issue here.  The issue is whether, when
> >> asked to exit immediately, all processes should exit immediately, or
> >> whether it would be better for all processes except one to exit
> >> immediately and the last one exit non-immediately.  In other words,
> >> when the user asks for an immediate shutdown, do they really mean it,
> >> or are they OK taking time to do some other stuff first?
> >
> > Peter already gave the response to that, which is that users do not
> > expect an immediate shutdown to have permanent harmful effects.
> > It doesn't have such effects so far as the SQL data goes; why is it
> > okay to blow away statistical data?
> 
> You're doing that anyway.  The backends aren't going to send any
> accumulated but unsent statistics to the stats collector before
> exiting; they're just going to exit.

Sure.  That loses a few counts, but as you argue below, it's just "a few
parts per million".

> > Yes, I am, and I disagree with you.  The current decision on this point
> > was made ages ago, before autovacuum even existed let alone relied on
> > the stats for proper functioning.  The tradeoff you're saying you're
> > okay with is "we'll shut down a few seconds faster, but you're going
> > to have table bloat problems later because autovacuum won't know it
> > needs to do anything".  I wonder how many of the complaints we get
> > about table bloat are a consequence of people not realizing that
> > "pg_ctl stop -m immediate" is going to cost them.
> 
> That would be useful information to have, but I bet the answer is "not
> that many".  Most people don't shut down their database very often;
> they're looking for continuous uptime.  It looks to me like autovacuum
> activity causes the statistics files to get refreshed at least once
> per autovacuum_naptime, which defaults to once a minute, so on the
> average we're talking about the loss of perhaps 30 seconds worth of
> statistics.

I think you're misunderstanding how this works.  Losing that file
doesn't lose just the final 30 seconds worth of data -- it loses
*everything*, and every counter goes back to zero.  So it's not a few
parts-per-million, it loses however many millions there were.

> I also think that you're wildly overestimating the likelihood that
> writing the stats file will be fast, because (1) anything that
> involves writing to the disk can be very slow, either because there's
> a lot of other write activity or because the disk is going bad, the
> latter being actually a pretty common cause of emergency database
> shutdowns, (2) the stats files can be quite large if the database
> system contains hundreds of thousands or even millions of objects,
> which is not all that infrequent, and (3) pgstat wait timeouts are
> pretty common, which would not be the case if writing the file was
> invariably fast (c.f. 75b48e1fff8a4dedd3ddd7b76f6360b5cc9bb741).

Those writes are slow because of the concurrent activity.  If all
backends just throw their hands in the air, no more writes come from
them, so the OS is going to finish the writes pretty quickly (or at
least empty enough of the caches so that the pgstat data fits); so
neither (1) nor (3) should be terribly serious.  I agree that (2) is a
problem, but it's not a problem for everyone.

> >> ...  Yeah, it's not good, but neither are the things that prompt
> >> people to perform an immediate shutdown in the first place.
> >
> > Really?  I think many users think an immediate shutdown is just fine.
> 
> Why would anybody ever perform an immediate shutdown rather than a
> fast shutdown if a fast shutdown were fast?

A fast shutdown is not all that fast -- it needs to write the whole
contents of shared buffers down to disk, which may be enormous.
Millions of times bigger than pgstat data.  So a fast shutdown is
actually very slow in a large machine.  An immediate shutdown, even if
it writes pgstat data, is still going to be much smaller in terms of
what is written.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-22 Thread Robert Haas
On Tue, Nov 22, 2016 at 12:54 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> But that's not what is at issue here.  The issue is whether, when
>> asked to exit immediately, all processes should exit immediately, or
>> whether it would be better for all processes except one to exit
>> immediately and the last one exit non-immediately.  In other words,
>> when the user asks for an immediate shutdown, do they really mean it,
>> or are they OK taking time to do some other stuff first?
>
> Peter already gave the response to that, which is that users do not
> expect an immediate shutdown to have permanent harmful effects.
> It doesn't have such effects so far as the SQL data goes; why is it
> okay to blow away statistical data?

You're doing that anyway.  The backends aren't going to send any
accumulated but unsent statistics to the stats collector before
exiting; they're just going to exit.

>> You're arguing that preserving the stats file is more important than
>> timely shutdown, but I don't buy it.
>
> Yes, I am, and I disagree with you.  The current decision on this point
> was made ages ago, before autovacuum even existed let alone relied on
> the stats for proper functioning.  The tradeoff you're saying you're
> okay with is "we'll shut down a few seconds faster, but you're going
> to have table bloat problems later because autovacuum won't know it
> needs to do anything".  I wonder how many of the complaints we get
> about table bloat are a consequence of people not realizing that
> "pg_ctl stop -m immediate" is going to cost them.

That would be useful information to have, but I bet the answer is "not
that many".  Most people don't shut down their database very often;
they're looking for continuous uptime.  It looks to me like autovacuum
activity causes the statistics files to get refreshed at least once
per autovacuum_naptime, which defaults to once a minute, so on the
average we're talking about the loss of perhaps 30 seconds worth of
statistics.  What percentage of database activity occurs within 30
second of an immediate shutdown?  For a hypothetical installation
where the DBA does an immediate shutdown at four randomly-chosen times
each day, the answer is "less than 0.2%".  In real installations, the
time between immediate shutdown is likely to be weeks or months, and
so the answer is perhaps two or three orders of magnitude less than
that.  The loss of a few (or even a few dozen) parts-per-million of
statistics data shouldn't make any material difference.

To make this add up to a significant loss, you have to suppose that
there was a particularly large transaction in flight just before the
crash.  But it can't have been in flight at the moment of the crash,
because then the statistics message would not then have been sent.  So
it has to be the case that some really big transaction ended and then
just after that the DBA did an immediate shutdown.  I'm not arguing
that it couldn't happen, but it's a pretty narrow target to be aiming
at.

I also think that you're wildly overestimating the likelihood that
writing the stats file will be fast, because (1) anything that
involves writing to the disk can be very slow, either because there's
a lot of other write activity or because the disk is going bad, the
latter being actually a pretty common cause of emergency database
shutdowns, (2) the stats files can be quite large if the database
system contains hundreds of thousands or even millions of objects,
which is not all that infrequent, and (3) pgstat wait timeouts are
pretty common, which would not be the case if writing the file was
invariably fast (c.f. 75b48e1fff8a4dedd3ddd7b76f6360b5cc9bb741).

>> ...  Yeah, it's not good, but neither are the things that prompt
>> people to perform an immediate shutdown in the first place.
>
> Really?  I think many users think an immediate shutdown is just fine.

Why would anybody ever perform an immediate shutdown rather than a
fast shutdown if a fast shutdown were fast?  Now you've incurred the
overhead of a crash recovery for no gain.  In my experience, people
perform immediate shutdowns mostly when they try something else and it
doesn't work.  (Other reasons: They are Oracle users who think that
immediate will do what it does on Oracle, namely what we call a fast
shutdown; or they don't want to wait for the shutdown checkpoint.)

>> The patch that
>> you claim moots this one was inspired by immediate shutdowns taking
>> too long, and that patch enjoyed pretty broad support IIRC.
>
> I think that's historical revisionism.  The commit message for 82233ce7e
> says "(This might happen, for example, if a backend gets tangled trying
> to malloc() due to gettext(), as in an example illustrated by MauMau.)".
> There is absolutely nothing in it about people being dissatisfied with
> the shutdown timing in normal cases; rather, it was done to prevent
> cases where shutdown failed to happen at all.

OK, you're probably right about 

Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-22 Thread Tom Lane
Robert Haas  writes:
> But that's not what is at issue here.  The issue is whether, when
> asked to exit immediately, all processes should exit immediately, or
> whether it would be better for all processes except one to exit
> immediately and the last one exit non-immediately.  In other words,
> when the user asks for an immediate shutdown, do they really mean it,
> or are they OK taking time to do some other stuff first?

Peter already gave the response to that, which is that users do not
expect an immediate shutdown to have permanent harmful effects.
It doesn't have such effects so far as the SQL data goes; why is it
okay to blow away statistical data?

> You're arguing that preserving the stats file is more important than
> timely shutdown, but I don't buy it.

Yes, I am, and I disagree with you.  The current decision on this point
was made ages ago, before autovacuum even existed let alone relied on
the stats for proper functioning.  The tradeoff you're saying you're
okay with is "we'll shut down a few seconds faster, but you're going
to have table bloat problems later because autovacuum won't know it
needs to do anything".  I wonder how many of the complaints we get
about table bloat are a consequence of people not realizing that
"pg_ctl stop -m immediate" is going to cost them.

> ...  Yeah, it's not good, but neither are the things that prompt
> people to perform an immediate shutdown in the first place.

Really?  I think many users think an immediate shutdown is just fine.

> The patch that
> you claim moots this one was inspired by immediate shutdowns taking
> too long, and that patch enjoyed pretty broad support IIRC.

I think that's historical revisionism.  The commit message for 82233ce7e
says "(This might happen, for example, if a backend gets tangled trying
to malloc() due to gettext(), as in an example illustrated by MauMau.)".
There is absolutely nothing in it about people being dissatisfied with
the shutdown timing in normal cases; rather, it was done to prevent
cases where shutdown failed to happen at all.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-22 Thread Robert Haas
On Tue, Nov 22, 2016 at 10:59 AM, Tom Lane  wrote:
> It's already the case that the pgstats code writes the stats data under a
> temporary file name and then renames it into place atomically.  So the
> prospects for corrupt data are not large, and I do not think that the
> existing removal behavior was intended to prevent that.  Rather, the
> concern was that if you do a point-in-time recovery to someplace much
> earlier on the WAL timeline, the stats file will be out of sync with
> what's now in your database.  That's a valid point, but deleting the
> stats file during *any* recovery seems like an overreaction.

But that's not what is at issue here.  The issue is whether, when
asked to exit immediately, all processes should exit immediately, or
whether it would be better for all processes except one to exit
immediately and the last one exit non-immediately.  In other words,
when the user asks for an immediate shutdown, do they really mean it,
or are they OK taking time to do some other stuff first?

I think that the charter of an immediate shutdown is to perform a
simulated crash.  We do our best to let clients know that we are going
down and then we go down.   We don't checkpoint, we don't do any other
operation that might take a lengthy period of time, we just SHUT DOWN.
You're arguing that preserving the stats file is more important than
timely shutdown, but I don't buy it.  We've never tried to do that in
the past and we've got no evidence that it's causing a problem for
anyone.  Yeah, it's not good, but neither are the things that prompt
people to perform an immediate shutdown in the first place.

On the other hand, there's plenty of evidence that slow shutdowns are
a big problem for users.  People use fast shutdown because smart
shutdown was too slow (never-ending), and we changed the default for
that reason.  I've repeatedly seen users who resorted to an immediate
shut down because a fast shutdown wasn't fast enough.  The patch that
you claim moots this one was inspired by immediate shutdowns taking
too long, and that patch enjoyed pretty broad support IIRC.  I think
it's fairly perverse to receive yet another complaint about shutdown
being slow and, instead of fixing it, go try to make sure that
slowness is baked in as a policy decision.

Immediate, adj. occurring or done at once; instant.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-22 Thread Tom Lane
"Tsunakawa, Takayuki"  writes:
> From: Tom Lane [mailto:t...@sss.pgh.pa.us]
>> The point I was trying to make is that I think the forced-removal behavior
>> is not desirable, and therefore committing a patch that makes it be graven
>> in stone is not desirable either.

> I totally agree that we should pursue the direction for escaping from the 
> complete loss of stats files.  Personally, I would like to combine that with 
> the idea of persistent performance diagnosis information for long-term 
> analysis (IIRC, someone proposed it.)  However, I don't think my patch will 
> make everyone forget about the problem of stats file loss during recovery.  
> The problem exists with or without my patch, and my patch doesn't have the 
> power to delute the importance of the problem.  If you are worried about 
> memory, we can add an entry for the problem in TODO list that Bruce-san is 
> maintaining.

> Or, maybe we can just stop removing the stats files during recovery by 
> keeping the files of previous generation and using it as the current one.  I 
> haven't seen how fresh the previous generation is (500ms ago?).  A bit older 
> might be better than nothing.

Freshness isn't the issue.  The stats file isn't there at all, in the
permanent stats directory, unless the collector takes the time to write
it before exiting.  Without that, we have unrecoverable loss of the stats
data.  Now, that isn't as bad as loss of the SQL data content, but it's
not good either.

It's already the case that the pgstats code writes the stats data under a
temporary file name and then renames it into place atomically.  So the
prospects for corrupt data are not large, and I do not think that the
existing removal behavior was intended to prevent that.  Rather, the
concern was that if you do a point-in-time recovery to someplace much
earlier on the WAL timeline, the stats file will be out of sync with
what's now in your database.  That's a valid point, but deleting the
stats file during *any* recovery seems like an overreaction.

The simplest solution I can think of is to delete the stats file when
doing a PITR operation, but not during simple crash recovery.  I've
not looked to see how hard it would be to do that, but it seems like
it should be a fairly minor logic tweak.  Maybe decide to do the removal
at the point where we intentionally stop following WAL someplace earlier
than its end.

Another angle we might take, independently of that, is to delete the
stats file if the stats collector process itself crashes.  This would
provide a recovery avenue if somehow we did have a stats file that
was corrupt enough to crash the collector.  And it would not matter
for post-startup crashes of the stats collector, because the file
would not be there anyway.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-22 Thread Ashutosh Bapat
On Tue, Nov 22, 2016 at 12:26 PM, Tsunakawa, Takayuki
 wrote:
> From: Tom Lane [mailto:t...@sss.pgh.pa.us]
>> The point I was trying to make is that I think the forced-removal behavior
>> is not desirable, and therefore committing a patch that makes it be graven
>> in stone is not desirable either.
>
> I totally agree that we should pursue the direction for escaping from the 
> complete loss of stats files.  Personally, I would like to combine that with 
> the idea of persistent performance diagnosis information for long-term 
> analysis (IIRC, someone proposed it.)  However, I don't think my patch will 
> make everyone forget about the problem of stats file loss during recovery.  
> The problem exists with or without my patch, and my patch doesn't have the 
> power to delute the importance of the problem.  If you are worried about 
> memory, we can add an entry for the problem in TODO list that Bruce-san is 
> maintaining.

I don't think Tom is worried about forgetting this. I think he is
worried that if we do the changes as suggested in 01... patch, we will
make it more difficult to change stats file behaviour later. Right now
we are throwing away statistics files at the time of crash recovery.
In case we want to change it tomorrow, we will have to fix the
existing problem with the half-written/stale stats files and then fix
not to zap those files at the time of crash recovery. In case we go
ahead with this patch, we will have more instances of creating
half-written/stale stats file, which will need to fixed.

>
> 9.4 change may be sufficient.  But I don't think I can proudly explain the 
> logic to a really severe customer.  I can't answer the question "Why does 
> PostgreSQL write files that will be deleted, even during 'immediate' 
> shutdown?  Why does PostgreSQL use 5 seconds for nothing?"
>
> Other children do nothing and exit immediately.  I believe they are behaving 
> correctly.

I agree, with this though.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-21 Thread Tsunakawa, Takayuki
From: Tom Lane [mailto:t...@sss.pgh.pa.us]
> The point I was trying to make is that I think the forced-removal behavior
> is not desirable, and therefore committing a patch that makes it be graven
> in stone is not desirable either.

I totally agree that we should pursue the direction for escaping from the 
complete loss of stats files.  Personally, I would like to combine that with 
the idea of persistent performance diagnosis information for long-term analysis 
(IIRC, someone proposed it.)  However, I don't think my patch will make 
everyone forget about the problem of stats file loss during recovery.  The 
problem exists with or without my patch, and my patch doesn't have the power to 
delute the importance of the problem.  If you are worried about memory, we can 
add an entry for the problem in TODO list that Bruce-san is maintaining.

Or, maybe we can just stop removing the stats files during recovery by keeping 
the files of previous generation and using it as the current one.  I haven't 
seen how fresh the previous generation is (500ms ago?).  A bit older might be 
better than nothing.

> The larger picture here is that Takayuki-san wants us to commit a patch
> based on a customer's objection to 9.2's behavior, without any real evidence
> that the 9.4 change isn't a sufficient solution.  I've got absolutely zero
> sympathy for that "the stats collector might be stuck in an unkillable state"
> argument --- where's the evidence that the stats collector is any more prone
> to that than any other postmaster child?

9.4 change may be sufficient.  But I don't think I can proudly explain the 
logic to a really severe customer.  I can't answer the question "Why does 
PostgreSQL write files that will be deleted, even during 'immediate' shutdown?  
Why does PostgreSQL use 5 seconds for nothing?"

Other children do nothing and exit immediately.  I believe they are behaving 
correctly.

> And for that matter, if we are stuck because of a nonresponding NFS server,
> how is a quicker postmaster exit going to help anything?
> You're not going to be able to start a new postmaster if the data directory
> is on a nonresponsive server.

NFS server can also be configured for HA, and the new postmaster can start as 
soon as the NFS server completes failover.

> I'd be willing to entertain a proposal to make the 5-second limit adjustable,
> but I don't think we need entirely new behavior here.

Then, I'm at a loss what to do for the 9.2 user.

Regards
Takayuki Tsunakawa



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-21 Thread Tsunakawa, Takayuki
From: Robert Haas [mailto:robertmh...@gmail.com]
> So there are two questions here:
> 
> 1. Should we try to avoid having the stats collector write a stats file
> during an immediate shutdown?  The file will be removed anyway during crash
> recovery, so writing it is pointless.  I think you are right that 9.4's
> solution here is not perfect, because of the 5 second delay, and also because
> if the stats collector is stuck inside the kernel trying to write to the
> OS, it may be in a non-interruptible wait state where even SIGKILL has no
> immediate effect.  Anyway, it's stupid even from a performance point of
> view to waste time writing a file that we're just going to nuke.
> 
> 2. Should we close listen sockets sooner during an immediate shutdown?
>  I agree with Tom and Peter that this isn't a good idea.  People expect
> the sockets not to go away until the end - e.g. they use
> PQping() to test the server status, or they connect just to see what error
> they get - and the fact that a client application could hypothetically
> generate such a relentless stream of connection attempts that the dead-end
> backends thereby created slow down shutdown is not in my mind a sufficient
> reason to change the behavior.
> 
> So I think 001 should proceed and 002 should be rejected.

I'm happy with this conclusion, since I think 1 was the cause of slow shutdown, 
and 2 is just a hypothesis to pursue the completeness.  And I can understand 
the concern about PQping().

Regards
Takayuki Tsunakawa


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-21 Thread Tom Lane
Robert Haas  writes:
> 1. Should we try to avoid having the stats collector write a stats
> file during an immediate shutdown?  The file will be removed anyway
> during crash recovery, so writing it is pointless.

The point I was trying to make is that I think the forced-removal behavior
is not desirable, and therefore committing a patch that makes it be graven
in stone is not desirable either.

The larger picture here is that Takayuki-san wants us to commit a patch
based on a customer's objection to 9.2's behavior, without any real
evidence that the 9.4 change isn't a sufficient solution.  I've got
absolutely zero sympathy for that "the stats collector might be stuck in
an unkillable state" argument --- where's the evidence that the stats
collector is any more prone to that than any other postmaster child?
And for that matter, if we are stuck because of a nonresponding NFS
server, how is a quicker postmaster exit going to help anything?
You're not going to be able to start a new postmaster if the data
directory is on a nonresponsive server.

I'd be willing to entertain a proposal to make the 5-second limit
adjustable, but I don't think we need entirely new behavior here.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-21 Thread Robert Haas
On Sun, Nov 20, 2016 at 10:20 PM, Tsunakawa, Takayuki
 wrote:
> The reasons why I proposed this patch are:
>
> * It happened in a highly mission-critical production system of a customer 
> who uses 9.2.
>
> * 9.4's solution is not perfect, because it wastes 5 seconds anyway, which is 
> unexpected for users.  The customer's requirement includes failover within 30 
> seconds, so 5 seconds can be seen as a risk.
> Plus, I'm worried about the possibility that the SIGKILLed process wouldn't 
> disappear if it's writing to a network storage like NFS.
>
> * And first of all, the immediate shutdown should shut the server down 
> immediately without doing anything heavy, as the name means.

So there are two questions here:

1. Should we try to avoid having the stats collector write a stats
file during an immediate shutdown?  The file will be removed anyway
during crash recovery, so writing it is pointless.  I think you are
right that 9.4's solution here is not perfect, because of the 5 second
delay, and also because if the stats collector is stuck inside the
kernel trying to write to the OS, it may be in a non-interruptible
wait state where even SIGKILL has no immediate effect.  Anyway, it's
stupid even from a performance point of view to waste time writing a
file that we're just going to nuke.

2. Should we close listen sockets sooner during an immediate shutdown?
 I agree with Tom and Peter that this isn't a good idea.  People
expect the sockets not to go away until the end - e.g. they use
PQping() to test the server status, or they connect just to see what
error they get - and the fact that a client application could
hypothetically generate such a relentless stream of connection
attempts that the dead-end backends thereby created slow down shutdown
is not in my mind a sufficient reason to change the behavior.

So I think 001 should proceed and 002 should be rejected.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-20 Thread Tsunakawa, Takayuki
From: Robert Haas [mailto:robertmh...@gmail.com]
> On Fri, Nov 18, 2016 at 4:12 PM, Tom Lane  wrote:
> > Alvaro Herrera  writes:
> >> Tom Lane wrote:
> >>> IMO it's not, and closer analysis says that this patch series is an
> >>> attempt to solve something we already fixed, better, in 9.4.
> >
> >> ... by the same patch submitter.
> >
> > [ confused ]  The commit log credits 82233ce7e to MauMau and yourself.
> 
> IIUC, MauMau = Tsunakawa Takayuki.

Yes, it's me.  I'm pleased that you remember me!

First, I understand that zapping the stats file during recovery can be a 
problem.  In fact, it's me who proposed adding a sentence in the manual that 
the stats file is reset after immediate shutdown.  I think addressing this 
problem is another topic in a new thread.

The reasons why I proposed this patch are:

* It happened in a highly mission-critical production system of a customer who 
uses 9.2.

* 9.4's solution is not perfect, because it wastes 5 seconds anyway, which is 
unexpected for users.  The customer's requirement includes failover within 30 
seconds, so 5 seconds can be seen as a risk.
Plus, I'm worried about the possibility that the SIGKILLed process wouldn't 
disappear if it's writing to a network storage like NFS.

* And first of all, the immediate shutdown should shut the server down 
immediately without doing anything heavy, as the name means.

So, I think this patch should also be applied to later releases.  The purpose 
of the patch in 9.4 was to avoid PostgreSQL's bug, where the ereport() in 
quickdie() gets stuck waiting for malloc()'s lock to be released.

Regards
Takayuki Tsunakawa


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-18 Thread Tom Lane
Alvaro Herrera  writes:
> Robert Haas wrote:
>> IIUC, MauMau = Tsunakawa Takayuki.

> Right, 
> https://www.postgresql.org/message-id/964413269E3A41409EA53EE0D813E48C@tunaPC

Ah!  I'd forgotten.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-18 Thread Alvaro Herrera
Robert Haas wrote:
> On Fri, Nov 18, 2016 at 4:12 PM, Tom Lane  wrote:
> > Alvaro Herrera  writes:
> >> Tom Lane wrote:
> >>> IMO it's not, and closer analysis says that this patch series is an
> >>> attempt to solve something we already fixed, better, in 9.4.
> >
> >> ... by the same patch submitter.
> >
> > [ confused ]  The commit log credits 82233ce7e to MauMau and yourself.
> 
> IIUC, MauMau = Tsunakawa Takayuki.

Right, 
https://www.postgresql.org/message-id/964413269E3A41409EA53EE0D813E48C@tunaPC

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-18 Thread Robert Haas
On Fri, Nov 18, 2016 at 4:12 PM, Tom Lane  wrote:
> Alvaro Herrera  writes:
>> Tom Lane wrote:
>>> IMO it's not, and closer analysis says that this patch series is an
>>> attempt to solve something we already fixed, better, in 9.4.
>
>> ... by the same patch submitter.
>
> [ confused ]  The commit log credits 82233ce7e to MauMau and yourself.

IIUC, MauMau = Tsunakawa Takayuki.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-18 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> IMO it's not, and closer analysis says that this patch series is an
>> attempt to solve something we already fixed, better, in 9.4.

> ... by the same patch submitter.

[ confused ]  The commit log credits 82233ce7e to MauMau and yourself.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-18 Thread Alvaro Herrera
Tom Lane wrote:
> Peter Eisentraut  writes:
> > On 11/14/16 4:38 AM, Ashutosh Bapat wrote:
> >> The patch 02_close_listen... closes the listen sockets
> >> explicitly when it's known that postmaster is going to stop all the
> >> children and then die. I have tried to see, if there's a possibility
> >> that it closes the listen sockets but do not actually die, thus
> >> causing a server which doesn't accept any connections and doesn't die.
> >> But I have not found that possibility.
> 
> > I can see the point of this, but I'm not sure whether this is always a
> > good idea.
> 
> IMO it's not, and closer analysis says that this patch series is an
> attempt to solve something we already fixed, better, in 9.4.

... by the same patch submitter.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-18 Thread Peter Eisentraut
On 11/18/16 12:00 PM, Tom Lane wrote:
> My feeling is that 82233ce7e has obsoleted all of the proposals made so
> far in this thread, and that we should reject them all.

Yes, it seems that very similar concerns were already addressed there.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-18 Thread Tom Lane
Peter Eisentraut  writes:
> On 11/14/16 4:38 AM, Ashutosh Bapat wrote:
>> The patch 02_close_listen... closes the listen sockets
>> explicitly when it's known that postmaster is going to stop all the
>> children and then die. I have tried to see, if there's a possibility
>> that it closes the listen sockets but do not actually die, thus
>> causing a server which doesn't accept any connections and doesn't die.
>> But I have not found that possibility.

> I can see the point of this, but I'm not sure whether this is always a
> good idea.

IMO it's not, and closer analysis says that this patch series is an
attempt to solve something we already fixed, better, in 9.4.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-18 Thread Tom Lane
I wrote:
> Peter Eisentraut  writes:
>> ISTM that this would change the "immediate shutdown" to not save stats
>> files anymore.  So far, all the shutdown modes are equivalent in terms
>> of how they preserve data and system state.  They differ only in when
>> the hard work happens.  This would be a deviation from that principle.

> There is that.  Up to now, an immediate shutdown request didn't cause
> any actual data loss, but now it would.

Oh, no, after rereading the thread I remember the point here: during
crash recovery, we intentionally zap the stats files on the grounds
that they might be broken, or might be out-of-sync with the recovery
stop point.  See pgstat_reset_all().

Now, you could certainly argue that that's an idea we should rethink;
we're throwing away probably-useful data because it *might* be wrong,
which seems rather pointless when it's known to be approximate anyway.
But if that stays, there's no very good reason not to short-circuit
writing of the files during immediate shutdown.

On the other side of the coin: I note that in 9.4 and up, the postmaster
has a 5-second patience limit before SIGKILL'ing child processes during a
crash or immediate shutdown (cf commit 82233ce7e).  That limit applies to
the stats collector too, and it seems to me that it addresses the real
problem here (ie, unbounded shutdown time) much better than the proposed
patch.  Note that the complaint that started this thread was against 9.2.

My feeling is that 82233ce7e has obsoleted all of the proposals made so
far in this thread, and that we should reject them all.  Instead we should
think about whether pgstat_reset_all should go away.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-18 Thread Peter Eisentraut
On 11/14/16 4:38 AM, Ashutosh Bapat wrote:
> The patch 02_close_listen... closes the listen sockets
> explicitly when it's known that postmaster is going to stop all the
> children and then die. I have tried to see, if there's a possibility
> that it closes the listen sockets but do not actually die, thus
> causing a server which doesn't accept any connections and doesn't die.
> But I have not found that possibility.

I can see the point of this, but I'm not sure whether this is always a
good idea.  Some people "monitor" postgres shutdown with PQping() or by
parsing the error messages that the client gets.  If we just close the
sockets as soon as possible, we lose all communication and can't tell
what's going on anymore.

If your HA system insists that the crashing server be completely shut
down before moving on, then maybe that can be refined somehow instead?

I haven't seen any analysis in this thread of how much of a problem this
really is.  For example, could we keep the socket open longer but reject
connection attempts faster in this state?

This patch only changes the behavior in the case of a crash or an
immediate shutdown, but not for the other shutdown modes.  I don't think
it is good to create different behaviors for different modes.

A long time ago, ClosePostmasterPorts() essentially had the job to close
all postmaster sockets.  Every single call thereof is in fact commented
with /* Close the postmaster's sockets */.  (Ancient postgres code uses
"port" as a more general term for socket or communication port.)  Now it
has accumulated other tasks so it is more of a
ClosePostmasterOnlyResourcesInChild().  So if we were to introduce a
ClosePostmasterSockets() somehow, we should adjust the naming and the
comments so that we don't have two similar-sounding functions with
confusing comments.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-18 Thread Tom Lane
Peter Eisentraut  writes:
> ISTM that this would change the "immediate shutdown" to not save stats
> files anymore.  So far, all the shutdown modes are equivalent in terms
> of how they preserve data and system state.  They differ only in when
> the hard work happens.  This would be a deviation from that principle.

There is that.  Up to now, an immediate shutdown request didn't cause
any actual data loss, but now it would.  Maybe that's a reason to reject
this whole concept.  (Upthread I was thinking that's a behavior we have
anyway, but really we don't: a look through pgstats.c shows that it will
never exit without attempting to write the stats, short of a crash of
the stats collector process itself.)

> Child processes don't distinguish between a SIGQUIT coming from a
> user-initiated immediate shutdown request and a crash-induced
> kill-everyone directive.  So there might not be a non-ugly way to
> address that.

It doesn't seem to me that it's a matter of whether the signaling is
adequate; we could fix that.  It's a matter of whether you're willing to
have "pg_ctl stop -m immediate" lose stats data.

Although the stats collector was originally conceived as optional,
we depend on it enough now for autovacuum that I'm not sure it'd be a
good thing to have a popularly-used shutdown mode that loses the stats.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-18 Thread Peter Eisentraut
On 9/27/16 11:07 PM, Tsunakawa, Takayuki wrote:
>> From: pgsql-hackers-ow...@postgresql.org
>> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tom Lane
>> Allowing SIGQUIT to prompt fast shutdown of the stats collector seems sane,
>> though.  Try to make sure it doesn't leave partly-written stats files
>> behind.
> 
> The attached patch based on HEAD does this.  I'd like this to be back-patched 
> because one of our important customers uses 9.2.
> 
> I didn't remove partially written stat files on SIGQUIT for the following 
> reasons.  Is this OK?
> 
> 1. The recovery at the next restart will remove the stat files.
> 2. SIGQUIT processing should be as fast as possible.
> 3. If writing stats files took long due to the OS page cache flushing, 
> removing files might be forced to wait likewise.

ISTM that this would change the "immediate shutdown" to not save stats
files anymore.  So far, all the shutdown modes are equivalent in terms
of how they preserve data and system state.  They differ only in when
the hard work happens.  This would be a deviation from that principle.

Child processes don't distinguish between a SIGQUIT coming from a
user-initiated immediate shutdown request and a crash-induced
kill-everyone directive.  So there might not be a non-ugly way to
address that.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-14 Thread Ashutosh Bapat
On Mon, Nov 14, 2016 at 5:16 AM, Tsunakawa, Takayuki
 wrote:
> From: pgsql-hackers-ow...@postgresql.org
>> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Ashutosh Bapat
>> I have changed some comments around this block. See attached patch.
>> Let me know if that looks good.
>
> Thanks, it looks good.
>
Thanks. The patch 02_close_listen... closes the listen sockets
explicitly when it's known that postmaster is going to stop all the
children and then die. I have tried to see, if there's a possibility
that it closes the listen sockets but do not actually die, thus
causing a server which doesn't accept any connections and doesn't die.
But I have not found that possibility.

I do not have any further comments about both the patches. None else
has volunteered to review the patch till now. So, marking the entry as
ready for committer.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-13 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Ashutosh Bapat
> I have changed some comments around this block. See attached patch.
> Let me know if that looks good.

Thanks, it looks good.

Regards
Takayuki Tsunakawa


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-11 Thread Ashutosh Bapat
On Mon, Nov 7, 2016 at 10:14 AM, Tsunakawa, Takayuki
 wrote:
> From: pgsql-hackers-ow...@postgresql.org
>> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Ashutosh Bapat
>> I am not sure if following condition is a good idea in ServerLoop()
>> 1650 if (pmState == PM_WAIT_DEAD_END || ClosedSockets)
>>
>> There are no sockets to listen on, so select in the else condition is going
>> to sleep for timeout determined based on the sequence expected.
>> Just before we close sockets in pmdie() it sets AbortStartTime, which
>> determines the timeout for the sleep here. So, it doesn't make sense to
>> ignore it. Instead may be we should change the default 60s sleep to 100ms
>> sleep in DetermineSleepTime().
>
> That sounds better.  I modified cleaned ServerLoop() by pushing the existing 
> 100ms logic into DetermineSleepTime().

I have changed some comments around this block. See attached patch.
Let me know if that looks good.

>
>
>> While the postmaster is terminating children, a new connection request may
>> arrive. We should probably close listening sockets before terminating
>> children in pmdie().
>
> I moved ClosePostmasterSocket() call before terminating children in the 
> immediate shutdown case.  I didn't change the behavior of smart and fast 
> shutdown modes, because they may take a long time to complete due to 
> checkpointing etc.  Users will want to know what's happening during shutdown 
> or after pg_ctl stop times out, by getting the message "FATAL:  the database 
> system is shutting down" when they try to connect to the database.  The 
> immediate shutdown or crash should better be as fast as possible.

OK.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 24add74..52c0f46 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -265,20 +265,22 @@ static StartupStatusEnum StartupStatus = STARTUP_NOT_RUNNING;
 
 /* Startup/shutdown state */
 #define			NoShutdown		0
 #define			SmartShutdown	1
 #define			FastShutdown	2
 #define			ImmediateShutdown	3
 
 static int	Shutdown = NoShutdown;
 
 static bool FatalError = false; /* T if recovering from backend crash */
+static bool ClosedSockets = false; /* T if postmaster's listening sockets
+	  have been closed. */
 
 /*
  * We use a simple state machine to control startup, shutdown, and
  * crash recovery (which is rather like shutdown followed by startup).
  *
  * After doing all the postmaster initialization work, we enter PM_STARTUP
  * state and the startup process is launched. The startup process begins by
  * reading the control file and other preliminary initialization steps.
  * In a normal startup, or after crash recovery, the startup process exits
  * with exit code 0 and we switch to PM_RUN state.  However, archive recovery
@@ -372,20 +374,21 @@ static DNSServiceRef bonjour_sdref = NULL;
 
 /*
  * postmaster.c - function prototypes
  */
 static void CloseServerPorts(int status, Datum arg);
 static void unlink_external_pid_file(int status, Datum arg);
 static void getInstallationPaths(const char *argv0);
 static void checkDataDir(void);
 static Port *ConnCreate(int serverFd);
 static void ConnFree(Port *port);
+static void ClosePostmasterSockets(void);
 static void reset_shared(int port);
 static void SIGHUP_handler(SIGNAL_ARGS);
 static void pmdie(SIGNAL_ARGS);
 static void reaper(SIGNAL_ARGS);
 static void sigusr1_handler(SIGNAL_ARGS);
 static void startup_die(SIGNAL_ARGS);
 static void dummy_handler(SIGNAL_ARGS);
 static void StartupPacketTimeoutHandler(void);
 static void CleanupBackend(int pid, int exitstatus);
 static bool CleanupBackgroundWorker(int pid, int exitstatus);
@@ -1513,20 +1516,28 @@ checkDataDir(void)
  * background tasks handled by ServerLoop get done even when no requests are
  * arriving.  However, if there are background workers waiting to be started,
  * we don't actually sleep so that they are quickly serviced.  Other exception
  * cases are as shown in the code.
  */
 static void
 DetermineSleepTime(struct timeval * timeout)
 {
 	TimestampTz next_wakeup = 0;
 
+	/* In PM_WAIT_DEAD_END state, sleeping for 100ms seems reasonable. */
+	if (pmState == PM_WAIT_DEAD_END)
+	{
+		timeout->tv_sec = 0;
+		timeout->tv_usec = 10L;
+		return;
+	}
+
 	/*
 	 * Normal case: either there are no background workers at all, or we're in
 	 * a shutdown sequence (during which we ignore bgworkers altogether).
 	 */
 	if (Shutdown > NoShutdown ||
 		(!StartWorkerNeeded && !HaveCrashedWorker))
 	{
 		if (AbortStartTime != 0)
 		{
 			/* time left to abort; clamp to 0 in case it already expired */
@@ -1623,57 +1634,52 @@ ServerLoop(void)
 
 	last_lockfile_recheck_time = last_touch_time = time(NULL);
 
 	nSockets = initMasks();
 
 	for (;;)
 	{
 		fd_set		rmask;
 		int			selres;
 		time_t		now;
+		/* must set timeout each 

Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-06 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Ashutosh Bapat
> I am not sure if following condition is a good idea in ServerLoop()
> 1650 if (pmState == PM_WAIT_DEAD_END || ClosedSockets)
> 
> There are no sockets to listen on, so select in the else condition is going
> to sleep for timeout determined based on the sequence expected.
> Just before we close sockets in pmdie() it sets AbortStartTime, which
> determines the timeout for the sleep here. So, it doesn't make sense to
> ignore it. Instead may be we should change the default 60s sleep to 100ms
> sleep in DetermineSleepTime().

That sounds better.  I modified cleaned ServerLoop() by pushing the existing 
100ms logic into DetermineSleepTime().


> While the postmaster is terminating children, a new connection request may
> arrive. We should probably close listening sockets before terminating
> children in pmdie().

I moved ClosePostmasterSocket() call before terminating children in the 
immediate shutdown case.  I didn't change the behavior of smart and fast 
shutdown modes, because they may take a long time to complete due to 
checkpointing etc.  Users will want to know what's happening during shutdown or 
after pg_ctl stop times out, by getting the message "FATAL:  the database 
system is shutting down" when they try to connect to the database.  The 
immediate shutdown or crash should better be as fast as possible.


> Otherwise this patch looks good to me. It applies and compiles cleanly.
> make check-world doesn't show any failures.

Thank you for reviewing and testing.  The revised patch is attached.

Regards
Takayuki Tsunakawa



02_close_listen_ports_early_v2.patch
Description: 02_close_listen_ports_early_v2.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-04 Thread Ashutosh Bapat
On Tue, Oct 4, 2016 at 2:35 PM, Tsunakawa, Takayuki
 wrote:
> From: pgsql-hackers-ow...@postgresql.org
>> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tom Lane
> FWIW, I'm pretty much -1 on messing with the timing of the socket close
>> actions.  I broke that once within recent memory, so maybe I'm gun-shy,
>> but I think that the odds of unpleasant side effects greatly outweigh any
>> likely benefit there.
>
> I couldn't find any relevant mails in pgsql-hackers.  I found no problem with 
> the attached patch.  Do you think this is OK?

Few comments on this patch,

I am not sure if following condition is a good idea in ServerLoop()
1650 if (pmState == PM_WAIT_DEAD_END || ClosedSockets)

There are no sockets to listen on, so select in the else condition is
going to sleep for timeout determined based on the sequence expected.
Just before we close sockets in pmdie() it sets AbortStartTime, which
determines the timeout for the sleep here. So, it doesn't make sense
to ignore it. Instead may be we should change the default 60s sleep to
100ms sleep in DetermineSleepTime().

If not, we need to at least update the comments to indicate the other
reason for not polling using select().
 * If we are in PM_WAIT_DEAD_END state, then we don't want to accept
 * any new connections, so we don't call select(), and just sleep.
 */
memcpy((char *) , (char *) , sizeof(fd_set));

-   if (pmState == PM_WAIT_DEAD_END)
+   if (pmState == PM_WAIT_DEAD_END || ClosedSockets)

While the postmaster is terminating children, a new connection request
may arrive. We should probably close listening sockets before
terminating children in pmdie().

Otherwise this patch looks good to me. It applies and compiles
cleanly. make check-world doesn't show any failures.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-10-27 Thread Tsunakawa, Takayuki
From: Ashutosh Bapat [mailto:ashutosh.ba...@enterprisedb.com]
> Ok. In that case, I think we shouldn't even call PG_SETMASK() similar to
> pgarch_exit(). Attached patch removes PG_SETMASK(). Let me know if it looks
> good.

It looks good.  Thanks.

Regards
Takayuki Tsunakawa


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-10-27 Thread Ashutosh Bapat
On Thu, Oct 27, 2016 at 7:29 AM, Tsunakawa, Takayuki
 wrote:
> From: pgsql-hackers-ow...@postgresql.org
>> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Ashutosh Bapat
>> In pgstat_quickdie(), I think a call to sigaddset(, SIGQUIT) is
>> missing before PG_SETMASK(). Although there are some SIGQUIT handlers which
>> do not have that call. But I guess, it will be safer to have it.
>
> I didn't add it because pgstat_quickdie() just exits, like some other 
> postmaster children.  I thought those processes which are concerned about 
> their termination processing call sigaddset(SIGQUIT), so I went after the 
> processes who aren't.  Is this really necessary?
>

Ok. In that case, I think we shouldn't even call PG_SETMASK() similar
to pgarch_exit(). Attached patch removes PG_SETMASK(). Let me know if
it looks good.

>> Also, many other SIGQUIT handlers like bgworker_quickdie() call
>> on_exit_reset() followed by exit(2) instead of just exit(1) in
>> pgstat_quickdie(). Why is this difference?
>
> As Robert and Tom said, either exit(1) or exit(2) is OK because reaper() 
> handles non-zero exit code the same.

Yes, per reaper().
2955 /*
2956  * Was it the statistics collector?  If so, just try to start a new
2957  * one; no need to force reset of the rest of the system.
(If fail,
2958  * we'll try again in future cycles of the main loop.)
2959  */
2960 if (pid == PgStatPID)
2961 {
2962 PgStatPID = 0;
2963 if (!EXIT_STATUS_0(exitstatus))
2964 LogChildExit(LOG, _("statistics collector process"),
2965  pid, exitstatus);
2966 if (pmState == PM_RUN)
2967 PgStatPID = pgstat_start();
2968 continue;
2969 }


> Regarding on_proc_reset(), stats collector is not attached to the shared 
> memory and does not register on_proc_exit() callbacks.  These situations are 
> the same as the archiver process, so I followed it.
>

Right. I got confused because of on_shmem_exit() call in
pgstat_initialize. But that's per backend code and not executed within
stats collector.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 5112d6d..b160f4c 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -244,20 +244,21 @@ static instr_time total_func_time;
 
 /* --
  * Local function forward declarations
  * --
  */
 #ifdef EXEC_BACKEND
 static pid_t pgstat_forkexec(void);
 #endif
 
 NON_EXEC_STATIC void PgstatCollectorMain(int argc, char *argv[]) pg_attribute_noreturn();
+static void pgstat_quickdie(SIGNAL_ARGS);
 static void pgstat_exit(SIGNAL_ARGS);
 static void pgstat_beshutdown_hook(int code, Datum arg);
 static void pgstat_sighup_handler(SIGNAL_ARGS);
 
 static PgStat_StatDBEntry *pgstat_get_db_entry(Oid databaseid, bool create);
 static PgStat_StatTabEntry *pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry,
 	 Oid tableoid, bool create);
 static void pgstat_write_statsfiles(bool permanent, bool allDbs);
 static void pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent);
 static HTAB *pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep);
@@ -3694,27 +3695,27 @@ pgstat_send_bgwriter(void)
  */
 NON_EXEC_STATIC void
 PgstatCollectorMain(int argc, char *argv[])
 {
 	int			len;
 	PgStat_Msg	msg;
 	int			wr;
 
 	/*
 	 * Ignore all signals usually bound to some action in the postmaster,
-	 * except SIGHUP and SIGQUIT.  Note we don't need a SIGUSR1 handler to
-	 * support latch operations, because we only use a local latch.
+	 * except SIGHUP, SIGTERM and SIGQUIT.  Note we don't need a SIGUSR1
+	 * handler to support latch operations, because we only use a local latch.
 	 */
 	pqsignal(SIGHUP, pgstat_sighup_handler);
 	pqsignal(SIGINT, SIG_IGN);
-	pqsignal(SIGTERM, SIG_IGN);
-	pqsignal(SIGQUIT, pgstat_exit);
+	pqsignal(SIGTERM, pgstat_exit);
+	pqsignal(SIGQUIT, pgstat_quickdie);
 	pqsignal(SIGALRM, SIG_IGN);
 	pqsignal(SIGPIPE, SIG_IGN);
 	pqsignal(SIGUSR1, SIG_IGN);
 	pqsignal(SIGUSR2, SIG_IGN);
 	pqsignal(SIGCHLD, SIG_DFL);
 	pqsignal(SIGTTIN, SIG_DFL);
 	pqsignal(SIGTTOU, SIG_DFL);
 	pqsignal(SIGCONT, SIG_DFL);
 	pqsignal(SIGWINCH, SIG_DFL);
 	PG_SETMASK();
@@ -3724,40 +3725,40 @@ PgstatCollectorMain(int argc, char *argv[])
 	 */
 	init_ps_display("stats collector process", "", "", "");
 
 	/*
 	 * Read in existing stats files or initialize the stats to zero.
 	 */
 	pgStatRunningInCollector = true;
 	pgStatDBHash = pgstat_read_statsfiles(InvalidOid, true, true);
 
 	/*
-	 * Loop to process messages until we get SIGQUIT or detect ungraceful
+	 * Loop to process messages until we get SIGTERM or detect ungraceful
 	 * death of our parent postmaster.
 	 *
 	 * For performance reasons, we don't want to do ResetLatch/WaitLatch after
 	 * every 

Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-10-26 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Ashutosh Bapat
> In pgstat_quickdie(), I think a call to sigaddset(, SIGQUIT) is
> missing before PG_SETMASK(). Although there are some SIGQUIT handlers which
> do not have that call. But I guess, it will be safer to have it.

I didn't add it because pgstat_quickdie() just exits, like some other 
postmaster children.  I thought those processes which are concerned about their 
termination processing call sigaddset(SIGQUIT), so I went after the processes 
who aren't.  Is this really necessary?

> Also, many other SIGQUIT handlers like bgworker_quickdie() call
> on_exit_reset() followed by exit(2) instead of just exit(1) in
> pgstat_quickdie(). Why is this difference?

As Robert and Tom said, either exit(1) or exit(2) is OK because reaper() 
handles non-zero exit code the same.  Regarding on_proc_reset(), stats 
collector is not attached to the shared memory and does not register 
on_proc_exit() callbacks.  These situations are the same as the archiver 
process, so I followed it.

Regards
Takayuki Tsunakawa


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-10-26 Thread Tom Lane
Robert Haas  writes:
> On Wed, Oct 26, 2016 at 7:12 AM, Ashutosh Bapat
>  wrote:
>> Also, many other SIGQUIT handlers like bgworker_quickdie() call 
>> on_exit_reset()
>> followed by exit(2) instead of just exit(1) in pgstat_quickdie(). Why is this
>> difference?

> Well, for that, you'd need to look at how postmaster.c treats those
> exit codes.  exit(2) from a regular backend or background worker will
> cause a crash-and-restart cycle; I'm not sure whether the handling for
> the stats collector is similar or different.

I'm fairly sure it's different --- there are different policies for
child processes that aren't connected to shared memory (such as the
stats collector) because we shouldn't force a system-wide restart
when they crash.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-10-26 Thread Robert Haas
On Wed, Oct 26, 2016 at 7:12 AM, Ashutosh Bapat
 wrote:
> Also, many other SIGQUIT handlers like bgworker_quickdie() call 
> on_exit_reset()
> followed by exit(2) instead of just exit(1) in pgstat_quickdie(). Why is this
> difference?

Well, for that, you'd need to look at how postmaster.c treats those
exit codes.  exit(2) from a regular backend or background worker will
cause a crash-and-restart cycle; I'm not sure whether the handling for
the stats collector is similar or different.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-10-26 Thread Ashutosh Bapat
On Wed, Sep 28, 2016 at 8:37 AM, Tsunakawa, Takayuki
 wrote:
>> From: pgsql-hackers-ow...@postgresql.org
>> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tom Lane
>> Allowing SIGQUIT to prompt fast shutdown of the stats collector seems sane,
>> though.  Try to make sure it doesn't leave partly-written stats files
>> behind.
>
> The attached patch based on HEAD does this.  I'd like this to be back-patched 
> because one of our important customers uses 9.2.
>
> I didn't remove partially written stat files on SIGQUIT for the following 
> reasons.  Is this OK?
>
> 1. The recovery at the next restart will remove the stat files.
> 2. SIGQUIT processing should be as fast as possible.
> 3. If writing stats files took long due to the OS page cache flushing, 
> removing files might be forced to wait likewise.
>

I agree with the first point.

The patch applies and compiles clean. make check-world is clean.

In pgstat_quickdie(), I think a call to sigaddset(, SIGQUIT) is
missing before PG_SETMASK(). Although there are some SIGQUIT handlers which do
not have that call. But I guess, it will be safer to have it.

Also, many other SIGQUIT handlers like bgworker_quickdie() call on_exit_reset()
followed by exit(2) instead of just exit(1) in pgstat_quickdie(). Why is this
difference?

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-10-05 Thread Tsunakawa, Takayuki
From: Robert Haas [mailto:robertmh...@gmail.com]
> I have no opinion on this patch, because I haven't reviewed it, but note
> recent commit 3b90e38c5d592ea8ec8236287dd5c749fc041728, which appears to
> be semi-related.

Thank you for interesting information.  Maybe Tom-san experienced some trouble 
in creating this patch.  Fortunately, this doesn't appear to be related to my 
patch, because the patch changed the timing of closing listen ports in 
postmaster children, whereas my patch explicitly closes listen ports in 
postmaster.

Regards
Takayuki Tsunakawa



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-10-04 Thread Robert Haas
On Tue, Oct 4, 2016 at 5:05 AM, Tsunakawa, Takayuki
 wrote:
> From: pgsql-hackers-ow...@postgresql.org
>> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tom Lane
> FWIW, I'm pretty much -1 on messing with the timing of the socket close
>> actions.  I broke that once within recent memory, so maybe I'm gun-shy,
>> but I think that the odds of unpleasant side effects greatly outweigh any
>> likely benefit there.
>
> I couldn't find any relevant mails in pgsql-hackers.  I found no problem with 
> the attached patch.  Do you think this is OK?

I have no opinion on this patch, because I haven't reviewed it, but
note recent commit 3b90e38c5d592ea8ec8236287dd5c749fc041728, which
appears to be semi-related.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-10-04 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tom Lane
FWIW, I'm pretty much -1 on messing with the timing of the socket close
> actions.  I broke that once within recent memory, so maybe I'm gun-shy,
> but I think that the odds of unpleasant side effects greatly outweigh any
> likely benefit there.

I couldn't find any relevant mails in pgsql-hackers.  I found no problem with 
the attached patch.  Do you think this is OK?

Regards
Takayuki Tsunakawa




02_close_listen_ports_early.patch
Description: 02_close_listen_ports_early.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-09-27 Thread Tsunakawa, Takayuki
> From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tom Lane
> Allowing SIGQUIT to prompt fast shutdown of the stats collector seems sane,
> though.  Try to make sure it doesn't leave partly-written stats files
> behind.

The attached patch based on HEAD does this.  I'd like this to be back-patched 
because one of our important customers uses 9.2.

I didn't remove partially written stat files on SIGQUIT for the following 
reasons.  Is this OK?

1. The recovery at the next restart will remove the stat files.
2. SIGQUIT processing should be as fast as possible.
3. If writing stats files took long due to the OS page cache flushing, removing 
files might be forced to wait likewise.


> FWIW, I'm pretty much -1 on messing with the timing of the socket close
> actions.  I broke that once within recent memory, so maybe I'm gun-shy,
> but I think that the odds of unpleasant side effects greatly outweigh any
> likely benefit there.

Wasn't it related to TouchSocketFiles()?  Can I see the discussion on this ML?  
I don't see any problem looking at the code...

Regards
Takayuki Tsunakawa





01_pgstat_avoid_writing_on_sigquit.patch
Description: 01_pgstat_avoid_writing_on_sigquit.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-09-26 Thread Tom Lane
"Tsunakawa, Takayuki"  writes:
>> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas
>> I think that we shouldn't start changing things based on guesses about what
>> the problem is, even if they're fairly smart guesses.  The thing to do would
>> be to construct a test rig, crash the server repeatedly, and add debugging
>> instrumentation to figure out where the time is actually going.

> We have tried to reproduce the problem in the past several days with much 
> more stress on our environment than on the customer's one -- 1,000 tables 
> aiming for a dozens of times larger stats file and repeated reconnection 
> requests from hundreds of clients -- but we could not succeed.

>> I do think your theory about the stats collector might be worth pursuing.
>> It seems that the stats collector only responds to SIGQUIT, ignoring SIGTERM.
>> Making it do a clean shutdown on SIGTERM and a fast exit on SIGQUIT seems
>> possibly worthwhile.

> Thank you for giving confidence for proceeding.  And I also believe that 
> postmaster should close the listening ports earlier. Regardless of whether 
> this problem will be solved not confident these will solve the, I think it'd 
> be better to fix these two points so that postmaster doesn't longer time than 
> necessary.  I think I'll create a patch after giving it a bit more thought.

FWIW, I'm pretty much -1 on messing with the timing of the socket close
actions.  I broke that once within recent memory, so maybe I'm gun-shy,
but I think that the odds of unpleasant side effects greatly outweigh
any likely benefit there.

Allowing SIGQUIT to prompt fast shutdown of the stats collector seems
sane, though.  Try to make sure it doesn't leave partly-written stats
files behind.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-09-22 Thread Tsunakawa, Takayuki
> From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas
> On Tue, Sep 20, 2016 at 2:20 AM, Tsunakawa, Takayuki
>  wrote:
> > There's no apparent evidence to indicate the cause, but I could guess
> > a few reasons.  What do you think these are correct and should fix
> > PostgreSQL? (I think so)
> 
> I think that we shouldn't start changing things based on guesses about what
> the problem is, even if they're fairly smart guesses.  The thing to do would
> be to construct a test rig, crash the server repeatedly, and add debugging
> instrumentation to figure out where the time is actually going.

We have tried to reproduce the problem in the past several days with much more 
stress on our environment than on the customer's one -- 1,000 tables aiming for 
a dozens of times larger stats file and repeated reconnection requests from 
hundreds of clients -- but we could not succeed.



> I do think your theory about the stats collector might be worth pursuing.
> It seems that the stats collector only responds to SIGQUIT, ignoring SIGTERM.
> Making it do a clean shutdown on SIGTERM and a fast exit on SIGQUIT seems
> possibly worthwhile.

Thank you for giving confidence for proceeding.  And I also believe that 
postmaster should close the listening ports earlier. Regardless of whether this 
problem will be solved not confident these will solve the, I think it'd be 
better to fix these two points so that postmaster doesn't longer time than 
necessary.  I think I'll create a patch after giving it a bit more thought.

Regards
Takayuki Tsunakawa




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-09-20 Thread Robert Haas
On Tue, Sep 20, 2016 at 2:20 AM, Tsunakawa, Takayuki
 wrote:
> There's no apparent evidence to indicate the cause, but I could guess a few 
> reasons.  What do you think these are correct and should fix PostgreSQL? (I 
> think so)

I think that we shouldn't start changing things based on guesses about
what the problem is, even if they're fairly smart guesses.  The thing
to do would be to construct a test rig, crash the server repeatedly,
and add debugging instrumentation to figure out where the time is
actually going.

I do think your theory about the stats collector might be worth
pursuing.  It seems that the stats collector only responds to SIGQUIT,
ignoring SIGTERM.  Making it do a clean shutdown on SIGTERM and a fast
exit on SIGQUIT seems possibly worthwhile.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-09-20 Thread Tsunakawa, Takayuki
Hello,

Please let me ask you about possible causes of a certain problem, slow shutdown 
of postmaster when a backend crashes, and whether to fix PostgreSQL.

Our customer is using 64-bit PostgreSQL 9.2.8 on RHEL 6.4.  Yes, the PostgreSQL 
version is rather old but there's no relevant bug fix in later 9.2.x releases.


PROBLEM
==

One backend process (postgres) for an application session crashed due to a 
segmentation fault and dumped a core file.  The cause is a bug of 
pg_dbms_stats.  Another note is that restart_after_crash is off to make 
failover happen.

The problem here is that postmaster took as long as 15 seconds to terminate 
after it had detected a crashed backend.  The messages were output as follows:

20:12:35.004に
LOG:  server process (PID 31894) was terminated by signal 11: Segmentation fault
DETAIL:  Failed process was running: DELETE...(snip)
LOG:  terminating any other active server processes

>From 20:12:35.013 to 20:12:39.074, the following message was output 80 times.

FATAL:  the database system is in recovery mode

20:12:50
The custom monitoring system detected the death of postmaster as a result of 
running "pg_ctl status".

That's it.  You may say the following message should also have been emitted, 
but there's not.  This is because we commented out the ereport() call in 
quickdie() in tcop.c.  That ereport() call can hang depending on the timing, 
which is fixed in 9.4.

WARNING:  terminating connection because of crash of another server process

The customer insists that PostgreSQL takes longer to shut down than expected, 
which risks exceeding their allowed failover time.


CAUSE
==

There's no apparent evidence to indicate the cause, but I could guess a few 
reasons.  What do you think these are correct and should fix PostgreSQL? (I 
think so)

1) postmaster should close the listening ports earlier
As cited above, for 4 seconds, postmaster created 80 dead-end child processes 
which just output "FATAL:  the database system is in recovery mode".  This 
indicates that postmaster is busy handling re-connection requests from 
disconnected applications, preventing postmaster from reaping dead children as 
fast as possible.  This is a waste because postmaster will only shut down.

I think the listening ports should be closed in HandleChildCrash() when the 
condition "(RecoveryError || !restart_after_crash)" is true.


2) make stats collector terminate immediately
stats collector seems to write the permanent stats file even when it receives 
SIGQUIT.  But it's useless because the stat file is reset during recovery.  And 
Tom claimed that writing stats file can take long:

https://www.postgresql.org/message-id/11800.1455135...@sss.pgh.pa.us


3) Anything else?
While postmaster is in PM_WAIT_DEAD_END state, it leaves the listening ports 
open but doesn't call select()/accept().  As a result, incoming connection 
requests are accumulated in the listen queue of the sockets.  Does the OS have 
any bug to slow the process termination when the listen queue is not empty?


Regards
Takayuki Tsunakawa



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers