Re: [HACKERS] Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system

Noah Misch Sun, 13 Mar 2016 15:48:00 -0700

On Thu, Mar 03, 2016 at 06:08:07AM +0100, Tomas Vondra wrote:
> On 02/03/2016 06:46 AM, Noah Misch wrote:
> >On Mon, Feb 01, 2016 at 07:03:45PM +0100, Tomas Vondra wrote:
> >>On 12/22/2015 03:49 PM, Noah Misch wrote:
> >>>On Mon, Feb 18, 2013 at 06:19:12PM -0300, Alvaro Herrera wrote:
> >>>>I have pushed it now.  Further testing, of course, is always welcome.
> >>>
> >>>While investigating stats.sql buildfarm failures, mostly on animals
> >>>axolotl and shearwater, I found that this patch (commit 187492b)
> >>>inadvertently removed the collector's ability to coalesce inquiries.
> >>>Every PGSTAT_MTYPE_INQUIRY received now causes one stats file write.
> >>>Before, pgstat_recv_inquiry() did:
> >>>
> >>>   if (msg->inquiry_time > last_statrequest)
> >>>           last_statrequest = msg->inquiry_time;
> >>>
> >>>and pgstat_write_statsfile() did:
> >>>
> >>>   globalStats.stats_timestamp = GetCurrentTimestamp();
> >>>... (work of writing a stats file) ...
> >>>           last_statwrite = globalStats.stats_timestamp;
> >>>           last_statrequest = last_statwrite;
> >>>
> >>>If the collector entered pgstat_write_statsfile() with more inquiries
> >>>waiting in its socket receive buffer, it would ignore them as being too
> >>>old once it finished the write and resumed message processing. Commit
> >>>187492b converted last_statrequest to a "last_statrequests" list that we
> >>>wipe after each write.
> 
> So I've been looking at this today, and I think the attached patch should do
> the trick. I can't really verify it, because I've been unable to reproduce the
> non-coalescing - I presume it requires much slower system (axolotl is RPi, not
> sure about shearwater).
> 
> The patch simply checks DBEntry,stats_timestamp in pgstat_recv_inquiry() and
> ignores requests that are already resolved by the last write (maybe this
> should accept a file written up to PGSTAT_STAT_INTERVAL in the past).
> 
> The required field is already in DBEntry (otherwise it'd be impossible to
> determine if backends need to send inquiries or not), so this is pretty
> trivial change. I can't think of a simpler patch.
> 
> Can you try applying the patch on a machine where the problem is
> reproducible? I might have some RPi machines laying around (for other
> purposes).


I've not attempted to study the behavior on slow hardware.  Instead, my report
used stat-coalesce-v1.patch[1] to simulate slow writes.  (That diagnostic
patch no longer applies cleanly, so I'm attaching a rebased version.  I've
changed the patch name from "stat-coalesce" to "slow-stat-simulate" to
more-clearly distinguish it from the "pgstat-coalesce" patch.)  Problems
remain after applying your patch; consider "VACUUM pg_am" behavior:

9.2 w/ stat-coalesce-v1.patch:
  VACUUM returns in 3s, stats collector writes each file 1x over 3s
HEAD w/ slow-stat-simulate-v2.patch:
  VACUUM returns in 3s, stats collector writes each file 5x over 15s
HEAD w/ slow-stat-simulate-v2.patch and your patch:
  VACUUM returns in 10s, stats collector writes no files over 10s


[1] 
http://www.postgresql.org/message-id/[email protected]

diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 14afef6..4308df2 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -3444,6 +3444,7 @@ pgstat_send_bgwriter(void)
 NON_EXEC_STATIC void
 PgstatCollectorMain(int argc, char *argv[])
 {
+       unsigned total = 0;
        int                     len;
        PgStat_Msg      msg;
        int                     wr;
@@ -3555,6 +3556,10 @@ PgstatCollectorMain(int argc, char *argv[])
                                                 errmsg("could not read 
statistics message: %m")));
                        }
 
+                       elog(LOG, "stats %d: %u + %u = %u",
+                                msg.msg_hdr.m_type, total, len, total + len);
+                       total += len;
+
                        /*
                         * We ignore messages that are smaller than our common 
header
                         */
@@ -3947,6 +3952,13 @@ pgstat_write_statsfiles(bool permanent, bool allDbs)
         */
        fputc('E', fpout);
 
+       if (1)
+       {
+               PG_SETMASK(&BlockSig);
+               pg_usleep(3 * 1000000L);
+               PG_SETMASK(&UnBlockSig);
+       }
+
        if (ferror(fpout))
        {
                ereport(LOG,
diff --git a/src/test/regress/expected/stats.out 
b/src/test/regress/expected/stats.out
index a811265..064cf9f 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -63,6 +63,9 @@ begin
     SELECT (n_tup_ins > 0) INTO updated3
       FROM pg_stat_user_tables WHERE relname='trunc_stats_test';
 
+    raise log 'stats updated as of % snapshot: 1:% 2:% 3:%',
+      pg_stat_get_snapshot_timestamp(), updated1, updated2, updated3;
+
     exit when updated1 and updated2 and updated3;
 
     -- wait a little
diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql
index b3e2efa..d252124 100644
--- a/src/test/regress/sql/stats.sql
+++ b/src/test/regress/sql/stats.sql
@@ -58,6 +58,9 @@ begin
     SELECT (n_tup_ins > 0) INTO updated3
       FROM pg_stat_user_tables WHERE relname='trunc_stats_test';
 
+    raise log 'stats updated as of % snapshot: 1:% 2:% 3:%',
+      pg_stat_get_snapshot_timestamp(), updated1, updated2, updated3;
+
     exit when updated1 and updated2 and updated3;
 
     -- wait a little

-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system

Reply via email to