Re: [PROPOSAL] timestamp informations to pg_stat_statements
> No, the entries are not removed randomly. We track "usage" for each
> entry (essentially +1 for each time the query got executed), with a decay
> factor applied on each eviction (and we evict 5% at a time).

OK, I apologize; I hadn't looked at the source code in detail, and indeed the "usage" algorithm based on execution time and calls will keep the longest-running queries (and that's a good thing). In some cases (after an application release or a Postgres patch, for example) we may be interested in the least-used ones.

I'm still convinced that adding this kind of information (with a planid, as done in pg_stat_plans) would be an improvement. Maybe that's because I have used v$sql's first_load_time, last_active_time and plan_hash_value for (too) many years. It's always important to know when a new plan (good or bad) occurs, and pgss could give this kind of information even without an aggregating collector.

Last point (for the developer): after testing this patch, in most cases when calls=1 the created and last_updated values are identical; they don't even differ enough to reflect execution (or planning) duration. Is that a precision problem or a coding problem?

Regards
PAscal

--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
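For the "know when a new plan occurs" use case, a snapshot-based detector could look roughly like the sketch below. It assumes a view exposing (queryid, planid) pairs the way pg_stat_plans does; the column names and values here are illustrative assumptions, since stock pg_stat_statements does not expose a planid.

```python
# Hypothetical sketch: detect new (queryid, planid) pairs between two
# snapshots of a pg_stat_plans-like view. The queryid/planid columns
# are assumptions for illustration; stock pg_stat_statements has no
# planid column.

def new_plans(prev_snapshot, curr_snapshot):
    """Return (queryid, planid) pairs seen now but not before.

    Each snapshot is a set of (queryid, planid) tuples, e.g. built
    from: SELECT queryid, planid FROM pg_stat_plans;
    """
    return sorted(curr_snapshot - prev_snapshot)

prev = {(101, 7), (102, 9)}
curr = {(101, 7), (101, 8), (102, 9)}  # queryid 101 picked up a new plan

assert new_plans(prev, curr) == [(101, 8)]
```

A cron job keeping the previous snapshot and alerting on a non-empty result would flag plan changes without any aggregation collector.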
Re: [PROPOSAL] timestamp informations to pg_stat_statements
On 03/10/2018 04:43 PM, legrand legrand wrote:
> +1
> Having the time of the first occurrence of a statement is very useful
> for troubleshooting; it permits, for example, retrieving the order of
> operations in some complex cases (and that information isn't captured
> by any third-party collecting tool, which will only be able to
> provide a time range of occurrence).

I really don't see how this would be useful in reconstructing the order of operations, particularly in complex cases where I'd expect the queries to be executed repeatedly / interleaved in different ways. Furthermore, it would only work for the very first execution of each statement, so you would probably have to reset the stats over and over, which seems to directly contradict the purpose of pg_stat_statements (aggregation of data over longer periods of time). So unfortunately this seems rather useless, and log_statement=all seems like a much better / more reliable approach to achieve that.

I also doubt it really allows computation of averages, e.g. queries per second, because practical workloads usually have daily/weekly patterns (and different queries may follow different patterns). So I agree with Peter Geoghegan that tools regularly snapshotting pg_stat_statements and processing those snapshots are a better approach. I've seen a bunch of simple scripts doing just that, actually.

> I thought that pgss rows were removed randomly when the max number of
> rows was reached; wouldn't having last_executed information permit a
> kind of LRU removal?

No, the entries are not removed randomly. We track "usage" for each entry (essentially +1 for each time the query got executed), with a decay factor applied on each eviction (and we evict 5% at a time). It's not immediately obvious why something based on the time of the first/last execution would be better than the current algorithm.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
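The usage/decay scheme described above can be modeled in a few lines of Python. This is a toy sketch based only on the description in this thread (usage +1 per execution, least-used 5% evicted, a decay applied to survivors); the constant names and the 0.99 decay value are illustrative, not taken from the actual pg_stat_statements C source.

```python
# Toy model of pg_stat_statements eviction as described in the thread:
# usage grows with each execution; on eviction the least-used 5% of
# entries are dropped and the remaining usages decay. Constants are
# illustrative assumptions, not the values from the C implementation.

USAGE_DEALLOC_FRACTION = 0.05  # evict 5% of entries at a time
USAGE_DECAY = 0.99             # decay factor applied on each eviction

def execute(usage, queryid):
    """Record one execution: +1 usage for the entry."""
    usage[queryid] = usage.get(queryid, 0.0) + 1.0

def evict(usage):
    """Drop the least-used 5% of entries, then decay the rest."""
    n_evict = max(1, int(len(usage) * USAGE_DEALLOC_FRACTION))
    victims = sorted(usage, key=usage.get)[:n_evict]
    for qid in victims:
        del usage[qid]
    for qid in usage:
        usage[qid] *= USAGE_DECAY
    return victims

usage = {}
for qid, calls in [("q1", 100), ("q2", 3), ("q3", 50)]:
    for _ in range(calls):
        execute(usage, qid)

evicted = evict(usage)  # "q2" has the lowest usage, so it goes first
```

The decay is what makes this frequency-and-recency based rather than pure LRU: an entry that was hot long ago gradually loses its protection, but a single old execution never outranks a frequently run query.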
Re: [PROPOSAL] timestamp informations to pg_stat_statements
+1
Having the time of the first occurrence of a statement is very useful for troubleshooting; it permits, for example, retrieving the order of operations in some complex cases (and that information isn't captured by any third-party collecting tool, which will only be able to provide a time range of occurrence).

I thought that pgss rows were removed randomly when the max number of rows was reached; wouldn't having last_executed information permit a kind of LRU removal?

Regards
PAscal