I will mull over this and see if I can suggest a short phrase that points to the nuances without explaining them. I think the current phrasing draws attention to itself by seeming dubious, which creates a kind of cognitive dissonance for the reader. This report isn't mean to be a bug report on the functionality.
On Sat, Jul 15, 2017 at 10:17 PM Tom Lane <t...@sss.pgh.pa.us> wrote: > Peter Geoghegan <p...@bowt.ie> writes: > > On Sat, Jul 15, 2017 at 5:31 PM, <ba...@xaprb.com> wrote: > >> I wonder if "least-executed" is correct. I'm not an > expert and haven't > >> convinced myself of this by examining the code, but I think after N > distinct > >> queryid's have been seen, then any additional ones are ignored. But > that may > >> not be "least-executed" at all. It's > "most-recent" instead. I think we need > >> a new phrase here. > > > It's most executed since tracking for the entry began, with a special > > heuristic for queries that take a long time to execute, and might > > therefore consistently be evicted before execution finishes and costs > > are tallied (see "sticky entries" stuff for full details). Most > > executed means the total number of calls, which may not be the best > > thing to evict on the basis of, but certainly isn't too bad. > > > The way it actually works is that either 5% of all entries or 10 > > entries are evicted (whichever amount is greatest) once > > pg_stat_statements.max entries are reached. You're right that this > > means that the most marginal of entries cannot be usefully tracked, > > but I doubt that that's much of a problem in practice. It's the usual > > "recency versus frequency" cache eviction problem, but for query cost > > tracking purposes if 5,000 entries or 10,000 entries is truly > > insufficient, then pg_stat_statements probably isn't the right tool. > > The short answer, really, is that the algorithm is too complicated to be > worth explaining in the documentation --- and it's subject to change, > anyway. But "least-executed" is a reasonable short description, since > frequency of use is a major factor in the decisions. Certainly > "most-recent" is flat out wrong. > > I am not sure whether this complaint is actually meant as a bug report > that the algorithm didn't seem to work well on the OP's use case. If so, > we'd need a lot more details to have any hope of improving it (and the > documentation comments aren't the right submission forum, either). > > regards, tom lane >