Re: [HACKERS] Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Peter Geoghegan Thu, 29 Mar 2012 04:14:45 -0700

On 29 March 2012 02:09, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Thanks.  I've committed the patch along with the docs, after rather
> heavy editorialization.


Thank you.

> 1. What to do with EXPLAIN, SELECT INTO, etc.  We had talked about
> tweaking the behavior of statement nesting and some other possibilities.
> I think clearly this could use improvement but I'm not sure just how.
> (Note: I left out the part of your docs patch that attempted to explain
> the current behavior, since I think we should fix it not document it.)

Yeah, this is kind of unsatisfactory. Nobody would expect the module
to behave this way. On the other hand, users probably aren't hugely
interested in this information.

I'm still kind of attached to the idea of exposing the hash value in
the view. It could be handy in replication situations, to be able to
aggregate statistics across the cluster (assuming you're using
streaming replication and not a trigger based system). You need a
relatively stable identifier to be able to do that. You've already
sort-of promised to not break the format in point releases, because it
is serialised to disk, and may have to persist for months or years.
Also, it will drive home the reality of what's going on in situations
like this (from the docs):

"In some cases, queries with visibly different texts might get merged
into a single pg_stat_statements entry. Normally this will happen only
for semantically equivalent queries, but there is a small chance of
hash collisions causing unrelated queries to be merged into one entry.
(This cannot happen for queries belonging to different users or
databases, however.)

Since the hash value is computed on the post-parse-analysis
representation of the queries, the opposite is also possible: queries
with identical texts might appear as separate entries, if they have
different meanings as a result of factors such as different
search_path settings."

> 2. Whether and how to adjust the aging-out of sticky entries.  This
> seems like a research project, but the code impact should be quite
> localized.

As I said, I'll try and give it some thought, and do some experiments.

> BTW, I eventually concluded that the parameterization testing we were
> worried about before was a red herring.  As committed, the patch tries
> to store a normalized string if it found any deletable constants, full
> stop.  This seems to me to be correct behavior because the presence of
> constants is exactly what makes the string normalizable, and such
> constants *will* be ignored in the hash calculation no matter whether
> there are other parameters or not.

Yeah, that aspect of not canonicalising parametrised entries had
bothered me. I guess we're better off gracefully handling the problem
across the board, rather than attempting to band-aid the problem up by
just not having speculative hashtable entries in cases where they
arguably are not so useful. Avoiding canonicalising those constants
was somewhat misleading.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

Reply via email to