On Mon, Feb 01, 2016 at 07:03:45PM +0100, Tomas Vondra wrote:
> On 12/22/2015 03:49 PM, Noah Misch wrote:
> >On Mon, Feb 18, 2013 at 06:19:12PM -0300, Alvaro Herrera wrote:
> >>I have pushed it now.  Further testing, of course, is always welcome.
> >
> >While investigating stats.sql buildfarm failures, mostly on animals axolotl
> >and shearwater, I found that this patch (commit 187492b) inadvertently 
> >removed
> >the collector's ability to coalesce inquiries.  Every PGSTAT_MTYPE_INQUIRY
> >received now causes one stats file write.  Before, pgstat_recv_inquiry() did:
> >
> >     if (msg->inquiry_time > last_statrequest)
> >             last_statrequest = msg->inquiry_time;
> >
> >and pgstat_write_statsfile() did:
> >
> >     globalStats.stats_timestamp = GetCurrentTimestamp();
> >... (work of writing a stats file) ...
> >             last_statwrite = globalStats.stats_timestamp;
> >             last_statrequest = last_statwrite;
> >
> >If the collector entered pgstat_write_statsfile() with more inquiries waiting
> >in its socket receive buffer, it would ignore them as being too old once it
> >finished the write and resumed message processing.  Commit 187492b converted
> >last_statrequest to a "last_statrequests" list that we wipe after each write.
> So essentially we remove the list of requests, and thus on the next round we
> don't know the timestamp of the last request and write the file again
> unnecessarily. Do I get that right?

Essentially right.  Specifically, for each database, we must remember the
globalStats.stats_timestamp of the most recent write.  It could be okay to
forget the last request timestamp.  (I now doubt I picked the best lines to
quote, above.)

> What if we instead kept the list but marked the requests as 'invalid' so
> that we know the timestamp? In that case we'd be able to do pretty much
> exactly what the original code did (but at per-db granularity).

The most natural translation of the old code would be to add a write_time
field to struct DBWriteRequest.  One can infer "invalid" from write_time and
request_time.  There are other reasonable designs, though.

> We'd have to cleanup the list once in a while not to grow excessively large,
> but something like removing entries older than PGSTAT_STAT_INTERVAL should
> be enough.

Specifically, if you assume the socket delivers messages in the order sent,
you may as well discard entries having write_time at least
PGSTAT_STAT_INTERVAL older than the most recent cutoff_time seen in a
PgStat_MsgInquiry.  That delivery order assumption does not hold in general,
but I expect it's close enough for this purpose.

> >As for how to fix this, the most direct translation of the old logic is to
> >retain last_statrequests entries that could help coalesce inquiries.I lean
> >toward that for an initial, back-patched fix.
> That seems reasonable and I believe it's pretty much the idea I came up with
> above, right? Depending on how you define "entries that could help coalesce
> inquiries".


Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to