I've often wanted something similar. But I've struggled to come up
with a good way to decide which parameters to keep. And as someone
mentioned, there's the question of how to deal with very large
The other day I was poking around with pg_stat_statements and jsonlog
and I thought of another way to tackle this problem that I think may
be a better approach. If we logged the queryid in slow error messages
and slow query logs that would let you deal with larger data and also
keep more history without burdening the live system.
If the queryid was a column in the CSV logs (or a field in json logs,
etc) then you people who load their logs into a database for handling
would be able to index that column and quickly look up example
queries, sort them by time taken, or analyze them in other ways. Using
jsonlog you could do the same thing in Elasticsearch/Kibana.
I tried to hack this together quickly but it's actually a bit of a
pain for mundane reasons. Our current slow query logs are actually
slow *statement* logs which makes it a bit of an impedance mismatch
with pg_stat_statements which works per planned query. I think the
solution to this would be to drop the slow statement logs and have
pg_stat_statements log slow queries directly in the ExecutorEnd hook.
It would be nice to have the queryid be accessible for other logs as
well like debug_query_str is. I'm not sure the right way to do that
though. I tried just peeking in ActivePortal->plannedstmt but that's
not always set (and in particular is not set at the point slow
statements are logged). And it was causing crashes, presumably
ActivePortal is left uninitialized in some background worker that
doesn't need it?