Re: [HACKERS] Broken system timekeeping breaks the stats collector

2012-06-17 Thread Simon Riggs
On 17 June 2012 08:26, Tom Lane t...@sss.pgh.pa.us wrote:

> (1) In backend_read_statsfile, make an initial attempt to read the stats
> file and then read GetCurrentTimestamp after that.  If the local clock
> reading is less than the stats file's timestamp, we know that some sort
> of clock skew or glitch has happened, so force an inquiry message to be
> sent with the local timestamp.  But then accept the stats file anyway,
> since the skew might be small and harmless.  The reason for the forced
> inquiry message is to cause (2) to happen at the collector.

Fine, but please log this as a WARNING "system time skew detected", so
we can actually see it has happened rather than just silently
accepting the situation.

It would be useful to document whether there are any other negative
effects from altering system time.

Perhaps we should do the same test at startup to see if the clock has
gone backwards then also. Perhaps we should also make note of any
major changes in time since last startup, which might help us detect
other forms of corruption.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Broken system timekeeping breaks the stats collector

2012-06-17 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
> Fine, but please log this as a WARNING "system time skew detected", so
> we can actually see it has happened rather than just silently
> accepting the situation.

I think elog(LOG) is more appropriate, same as we have for the existing
messages for related complaints.  No one backend is going to have a
complete view of the situation, and the collector itself has to use
LOG since it has no connected client at all.  So the postmaster log
is the place to look for evidence of clock trouble.

> Perhaps we should do the same test at startup to see if the clock has
> gone backwards then also.

Uh ... backwards from what?  And what difference would it make?  We
always force an immediate write of the stats file at startup anyway.

regards, tom lane



[HACKERS] Broken system timekeeping breaks the stats collector

2012-06-16 Thread Tom Lane
I've had a vague feeling for awhile now that the occasional buildfarm
failures we see in the stats regression test (where it appears that the
stats collector fails to respond to requests for no good reason) might
be related to operating-system timekeeping glitches.  Today there is
finally indisputable evidence of this happening, in this log file:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=frogmouth&dt=2012-06-16%2014%3A30%3A12
Observe the following log excerpt:

[4fdcaca3.e74:478] LOG:  statement: CREATE INDEX dupindexcols_i ON dupindexcols 
(f1, id, f1 text_pattern_ops);
[4fdcaca3.e74:479] LOG:  statement: ANALYZE dupindexcols;
[4fdcaca3.e74:480] LOG:  statement: EXPLAIN (COSTS OFF)
  SELECT count(*) FROM dupindexcols
WHERE f1 > 'WA' and id < 1000 and f1 ~<~ 'YX';
[4fdcaca3.e74:481] LOG:  statement: SELECT count(*) FROM dupindexcols
  WHERE f1 > 'WA' and id < 1000 and f1 ~<~ 'YX';
[4fdcaca3.e74:482] LOG:  disconnection: session time: 0:00:00.000 user=pgrunner 
database=regression host=::1 port=1123
[4fdcab40.e04:1] LOG:  connection received: host=::1 port=1125
[4fdcab40.e04:2] LOG:  connection authorized: user=pgrunner database=regression
[4fdcab40.e04:3] LOG:  statement: CREATE TABLE a (aa TEXT);
[4fdcab40.444:1] LOG:  connection received: host=::1 port=1126
[4fdcab40.e88:1] LOG:  connection received: host=::1 port=1127
[4fdcab40.444:2] LOG:  connection authorized: user=pgrunner database=regression
[4fdcab40.444:3] LOG:  statement: CREATE USER regtest_unpriv_user;
[4fdcab40.444:4] LOG:  statement: CREATE SCHEMA temp_func_test;
[4fdcab40.e04:4] LOG:  statement: CREATE TABLE b (bb TEXT) INHERITS (a);

We can tell from the statements being executed that session 4fdcaca3.e74
is running the create_index regression test, while 4fdcab40.444 is
running create_function_3, and the other sessions starting concurrently
with it are part of the parallel group that runs after create_index.
So they certainly didn't start until 4fdcaca3.e74 finished.  Also
create_index is not exactly a zero-time test, so the fact that
4fdcaca3.e74 reported a session runtime of 0:00:00.000 should already
set off some alarm bells.  But remember that in the %c log_line_prefix
escape, the first part is the process's start-time timestamp in hex.
So 4fdcaca3 means Sat Jun 16 2012, 11:56:19 EDT while 4fdcab40 means
Sat Jun 16 2012, 11:50:24 EDT, nearly six minutes *earlier*.
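
The hex decoding is easy to check; a few lines of Python (using the session identifiers from the log excerpt above) confirm both the start time and the nearly-six-minute backwards jump:

```python
from datetime import datetime, timezone

def session_start(log_prefix):
    """Decode the %c log_line_prefix escape: the part before the dot is
    the backend's start time as a hexadecimal Unix epoch timestamp."""
    hex_time = log_prefix.split(".")[0]
    return datetime.fromtimestamp(int(hex_time, 16), tz=timezone.utc)

t_create_index = session_start("4fdcaca3.e74")
t_later_session = session_start("4fdcab40.e04")
print(t_create_index.isoformat())  # 2012-06-16T15:56:19+00:00, i.e. 11:56:19 EDT
print((t_create_index - t_later_session).total_seconds())  # 355.0 seconds
```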

The reported zero session time is explainable by the fact that
TimestampDifference returns zeros if the given timestamps are out of
order.  All the other process timestamps in the log are likewise
consistent with the theory that the system's clock went backwards
six-plus minutes while create_index was running.
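
That zeroing behavior is simple to model (a simplified Python sketch of what TimestampDifference does with out-of-order inputs; the real function works in C on TimestampTz values):

```python
def timestamp_difference(start, stop):
    """Simplified mirror of TimestampDifference: when the inputs are out
    of order (stop earlier than start), report zero rather than a
    negative interval."""
    return stop - start if stop > start else 0

# The session "ended" before it "began" once the clock stepped backwards:
print(timestamp_difference(1339862179, 1339861824))  # 0 -> logged as 0:00:00.000
```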

Then, when we finally get to the stats regression test, it fails with
symptoms indicating that the stats collector never answered the
backend's requests for an updated stats file; and when the postmaster
finally shuts down and tells the stats collector to shut down too, the
final pgstat_write_statsfile call bleats like so:
[4fdcac54.a0c:1] LOG:  last_statrequest 2012-06-16 11:55:20.813625-04 is later 
than collector's time 2012-06-16 11:54:12.392375-04

Once you know that the system clock glitched like that, it's fairly
obvious what happened inside the stats collector: it wrote out the stats
file (probably in response to an autovacuum request) sometime shortly
after 11:55:20, and then the system clock went backwards to around
11:50, and so all subsequent inquiry messages had request timestamps
older than last_statwrite causing the collector to believe it needn't
emit fresh stats files.
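
The stall can be reproduced with a toy model of the collector's check (Python for illustration only; times are seconds-since-midnight stand-ins, not real TimestampTz values):

```python
def needs_fresh_write(inquiry_time, last_statwrite):
    """Essential form of the collector's test: only rewrite the stats
    file when the request is stamped later than the last write."""
    return inquiry_time > last_statwrite

last_statwrite = 11 * 3600 + 55 * 60 + 20  # file written shortly after 11:55:20
glitched_clock = 11 * 3600 + 50 * 60       # system clock jumps back to ~11:50

# Every inquiry now carries a pre-write timestamp, so the collector keeps
# concluding that its stats file is already fresh enough:
print(needs_fresh_write(glitched_clock, last_statwrite))  # False, indefinitely
```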

Aside from breaking the stats regression test, a scenario like this one
has bad implications for autovacuum: for something like six minutes,
autovacuum would have been operating with stale pgstat data since the
stats file would not get rewritten, and backends would see the file
timestamp as being ahead of their own clocks so they wouldn't think it
was stale.  (So this scenario doesn't explain the occasional pgstat
wait timeout failures we see; though perhaps someone can think of a
variant that fits that symptom?)

I will refrain from disparaging Windows here and just note that this
would be an unsurprising occurrence on any machine not running NTP: the
owner might occasionally fix the clock once it had drifted far enough
from reality.  So I think it might be advisable to install some defenses
against the case rather than assuming it can't happen.

The most direct fix would involve checking GetCurrentTimestamp against
last_statswrite in the stats collector's main loop, but on a machine
with slow gettimeofday calls that would be pretty bad for performance.
What I suggest instead is:

(1) In backend_read_statsfile, make an initial attempt to read the stats
file and then read GetCurrentTimestamp after that.  If the local clock
reading is less than the stats file's timestamp, we know that some sort
of clock skew or glitch has happened, so force an inquiry message to be
sent with the local timestamp.  But then accept the stats file anyway,
since the skew might be small and harmless.  The reason for the forced
inquiry message is to cause (2) to happen at the collector.

(2) In pgstat_recv_inquiry, if the received inquiry_time is older than
last_statwrite, we should suspect a clock glitch (though it might just
indicate delayed message receipt).  In this case, do a fresh
GetCurrentTimestamp call, and if the reading is less than
last_statwrite, we know that the collector's time went backwards.
To recover, reset these variables as we do at startup:
        last_statrequest = GetCurrentTimestamp();
        last_statwrite = last_statrequest - 1;
to force an immediate write to happen with the new local time.

(1) is basically free in terms of the amount of work done in non-broken
cases, though it will require a few more lines of code.  (2) means
adding some GetCurrentTimestamp calls that did not occur before, but
hopefully these will be infrequent, since in the absence of clock
glitches they would only happen when a backend's demand for a new stats
file is generated before the collector starts to write a new stats file
but not received till afterwards.

Comments?  Anyone see a flaw in this design?  Or want to argue that
we shouldn't do anything about such cases?

Re: [HACKERS] Broken system timekeeping breaks the stats collector

2012-06-16 Thread Dickson S. Guedes
2012/6/16 Tom Lane t...@sss.pgh.pa.us:
[... cut ...]
> (1) In backend_read_statsfile, make an initial attempt to read the stats
> file and then read GetCurrentTimestamp after that.  If the local clock
> reading is less than the stats file's timestamp, we know that some sort
> of clock skew or glitch has happened, so force an inquiry message to be
> sent with the local timestamp.  But then accept the stats file anyway,
> since the skew might be small and harmless.  The reason for the forced
> inquiry message is to cause (2) to happen at the collector.
>
> (2) In pgstat_recv_inquiry, if the received inquiry_time is older than
> last_statwrite, we should suspect a clock glitch (though it might just
> indicate delayed message receipt).  In this case, do a fresh
> GetCurrentTimestamp call, and if the reading is less than
> last_statwrite, we know that the collector's time went backwards.
> To recover, reset these variables as we do at startup:
>        last_statrequest = GetCurrentTimestamp();
>        last_statwrite = last_statrequest - 1;
> to force an immediate write to happen with the new local time.
>
> (1) is basically free in terms of the amount of work done in non-broken
> cases, though it will require a few more lines of code.  (2) means
> adding some GetCurrentTimestamp calls that did not occur before, but
> hopefully these will be infrequent, since in the absence of clock
> glitches they would only happen when a backend's demand for a new stats
> file is generated before the collector starts to write a new stats file
> but not received till afterwards.
>
> Comments?  Anyone see a flaw in this design?  Or want to argue that
> we shouldn't do anything about such cases?
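
A minimal sketch of the two-step recovery quoted above (Python for illustration only; the real logic belongs in C in pgstat.c, and the integer timestamps here are made-up values):

```python
def backend_read_statsfile(file_ts, local_now):
    """Step (1): after reading the stats file, detect a file timestamp
    from the future and request a forced inquiry; the file is still
    accepted, since the skew may be small and harmless."""
    force_inquiry = local_now < file_ts
    return force_inquiry

def recv_inquiry(inquiry_time, last_statwrite, collector_now):
    """Step (2): an inquiry older than last_statwrite may just be a
    delayed message, so re-read the clock; only if the collector's own
    clock has gone backwards, reset as at startup to force a write."""
    if inquiry_time < last_statwrite and collector_now < last_statwrite:
        last_statrequest = collector_now
        last_statwrite = last_statrequest - 1  # guarantees request > write
    return last_statwrite

print(backend_read_statsfile(file_ts=460, local_now=100))  # True: glitch suspected
print(recv_inquiry(inquiry_time=100, last_statwrite=460, collector_now=101))  # 100: forces a fresh write
print(recv_inquiry(inquiry_time=450, last_statwrite=460, collector_now=470))  # 460: merely a delayed message
```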

What happens when Daylight saving time ends? Or it doesn't matter in
this scenario?

regards
-- 
Dickson S. Guedes
mail/xmpp: gue...@guedesoft.net - skype: guediz
http://guedesoft.net - http://www.postgresql.org.br



Re: [HACKERS] Broken system timekeeping breaks the stats collector

2012-06-16 Thread Tom Lane
Dickson S. Guedes lis...@guedesoft.net writes:
> What happens when Daylight saving time ends? Or it doesn't matter in
> this scenario?

Irrelevant, we're working in UTC-based timestamps.

regards, tom lane
