Re: [HACKERS] Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system

Tomas Vondra Sun, 13 Mar 2016 18:01:10 -0700

On Sun, 2016-03-13 at 18:46 -0400, Noah Misch wrote:
> On Thu, Mar 03, 2016 at 06:08:07AM +0100, Tomas Vondra wrote:
> > 
> > On 02/03/2016 06:46 AM, Noah Misch wrote:
> > > 
> > > On Mon, Feb 01, 2016 at 07:03:45PM +0100, Tomas Vondra wrote:
> > > > 
> > > > On 12/22/2015 03:49 PM, Noah Misch wrote:
> > > > > 
> > > > > On Mon, Feb 18, 2013 at 06:19:12PM -0300, Alvaro Herrera
> > > > > wrote:
> > > > > > 
> > > > > > I have pushed it now.  Further testing, of course, is
> > > > > > always welcome.
> > > > > While investigating stats.sql buildfarm failures, mostly on
> > > > > animals
> > > > > axolotl and shearwater, I found that this patch (commit
> > > > > 187492b)
> > > > > inadvertently removed the collector's ability to coalesce
> > > > > inquiries.
> > > > > Every PGSTAT_MTYPE_INQUIRY received now causes one stats file
> > > > > write.
> > > > > Before, pgstat_recv_inquiry() did:
> > > > > 
> > > > >       if (msg->inquiry_time > last_statrequest)
> > > > >               last_statrequest = msg->inquiry_time;
> > > > > 
> > > > > and pgstat_write_statsfile() did:
> > > > > 
> > > > >       globalStats.stats_timestamp = GetCurrentTimestamp();
> > > > > ... (work of writing a stats file) ...
> > > > >               last_statwrite = globalStats.stats_timestamp;
> > > > >               last_statrequest = last_statwrite;
> > > > > 
> > > > > If the collector entered pgstat_write_statsfile() with more
> > > > > inquiries
> > > > > waiting in its socket receive buffer, it would ignore them as
> > > > > being too
> > > > > old once it finished the write and resumed message
> > > > > processing. Commit
> > > > > 187492b converted last_statrequest to a "last_statrequests"
> > > > > list that we
> > > > > wipe after each write.
> > So I've been looking at this today, and I think the attached patch
> > should do
> > the trick. I can't really verify it, because I've been unable to
> > reproduce the
> > non-coalescing - I presume it requires much slower system (axolotl
> > is RPi, not
> > sure about shearwater).
> > 
> > The patch simply checks DBEntry,stats_timestamp in
> > pgstat_recv_inquiry() and
> > ignores requests that are already resolved by the last write (maybe
> > this
> > should accept a file written up to PGSTAT_STAT_INTERVAL in the
> > past).
> > 
> > The required field is already in DBEntry (otherwise it'd be
> > impossible to
> > determine if backends need to send inquiries or not), so this is
> > pretty
> > trivial change. I can't think of a simpler patch.
> > 
> > Can you try applying the patch on a machine where the problem is
> > reproducible? I might have some RPi machines laying around (for
> > other
> > purposes).
> I've not attempted to study the behavior on slow hardware.  Instead,
> my report
> used stat-coalesce-v1.patch[1] to simulate slow writes.  (That
> diagnostic
> patch no longer applies cleanly, so I'm attaching a rebased
> version.  I've
> changed the patch name from "stat-coalesce" to "slow-stat-simulate"
> to
> more-clearly distinguish it from the "pgstat-coalesce"
> patch.)  Problems
> remain after applying your patch; consider "VACUUM pg_am" behavior:
> 
> 9.2 w/ stat-coalesce-v1.patch:
>   VACUUM returns in 3s, stats collector writes each file 1x over 3s
> HEAD w/ slow-stat-simulate-v2.patch:
>   VACUUM returns in 3s, stats collector writes each file 5x over 15s
> HEAD w/ slow-stat-simulate-v2.patch and your patch:
>   VACUUM returns in 10s, stats collector writes no files over 10s


Oh damn, the timestamp comparison in pgstat_recv_inquiry should be in
the opposite direction. After fixing that "VACUUM pg_am" completes in 3
seconds and writes each file just once.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 14afef6..e750d46 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -4836,6 +4836,20 @@ pgstat_recv_inquiry(PgStat_MsgInquiry *msg, int len)
 	}
 
 	/*
+	 * Ignore requests that are already resolved by the last write.
+	 *
+	 * We discard the queue of requests at the end of pgstat_write_statsfiles(),
+	 * so the requests already waiting on the UDP socket at that moment can't
+	 * be discarded in the previous loop.
+	 *
+	 * XXX Maybe this should also care about the clock skew, just like the
+	 *     block a few lines down.
+	 */
+	dbentry = pgstat_get_db_entry(msg->databaseid, false);
+	if ((dbentry != NULL) && (msg->cutoff_time <= dbentry->stats_timestamp))
+		return;
+
+	/*
 	 * There's no request for this DB yet, so create one.
 	 */
 	newreq = palloc(sizeof(DBWriteRequest));
@@ -4852,7 +4866,6 @@ pgstat_recv_inquiry(PgStat_MsgInquiry *msg, int len)
 	 * retreat in the system clock reading could otherwise cause us to neglect
 	 * to update the stats file for a long time.
 	 */
-	dbentry = pgstat_get_db_entry(msg->databaseid, false);
 	if ((dbentry != NULL) && (msg->clock_time < dbentry->stats_timestamp))
 	{
 		TimestampTz cur_ts = GetCurrentTimestamp();

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system

Reply via email to