On 11/29/2018 10:59 AM, Stephen Frost wrote:
Greetings,

* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
On 2018-Nov-28, Tom Lane wrote:
Alvaro Herrera <alvhe...@2ndquadrant.com> writes:
On 2018-Nov-28, Tom Lane wrote:
This would also entail rather significant overhead to find out schema
names and interpolate them into the text.
True.  I was thinking that the qualified-names version of the query
would be obtained via ruleutils or some similar mechanism to deparse
from the parsed query tree (not from the original query text), where
only pg_catalog is considered visible.  This would be enabled using a
GUC that defaults to off.
Color me skeptical --- ruleutils has never especially been designed
to be fast, and I can't see that the overhead of this is going to be
acceptable to anybody who needs pg_stat_statements in production.
(Some admittedly rough experiments suggest that we might be
talking about an order-of-magnitude slowdown for simple queries.)
Good point.

Maybe we can save the OID array of schemas that are in search_path when
the query is first entered into the statement pool, and produce the
query_qn column only at the time the entry is interpreted (that is, when
pg_stat_statements is query).  ... oh, but that requires saving the plan
tree too, which doesn't sound very convenient.

Maybe just storing the search_path schemas (as Tomas already suggested)
is sufficient for Sergei's use case?  Do away with query_qn as such, and
just have the user interpret the names according to the stored
search_path.
Seems like what you'd really want is to store all the environment, not
just the search_path (consider the $user case...).  Maybe saving just
the OIDs of the search_path and then using them later would also work
but it seems like we're just building up to tracking everything and
doing it piecemeal with an extra column added in this release, another
in the next release, etc..

Thanks!

Stephen
It's a valid concern. Instead of the adding just search_path column we can add a column session_info jsonb.
Its content can be defined by the new configuration parameter

pg_stat_statements.session_info ('current_schemas, current_user, session_user') // a subset of the data from the system information functions

and it will have data like
{
    "current_schemas" : ["pg_catalog", "s1", "s2", "public"],
    "current_user" : "user1",
    "session_user" : "user1"
}

It will allow the DBA/developer to reproduce a performance issue, and will allow the deeper level of granularity for the reporting tools. It's more complex than I have anticipated but doesn't break backward compatibility and extensible.

Thank you,

Sergei


Reply via email to