On Thu, 20 Mar 2025 at 21:48, Ilia Evdokimov <ilya.evdoki...@tantorlabs.com> wrote: > -> Memoize (cost=0.30..0.41 rows=1 width=4) > Cache Key: t2.thousand > Cache Mode: logical > Cache Estimated Entries: 655 > Cache Estimated NDistinct: 721 > -> Index Only Scan using tenk1_unique1 on tenk1 t1 > (cost=0.29..0.40 rows=1 width=4) > Index Cond: (unique1 = t2.thousand) > (11 rows) > > Additionally, since this information would only be shown in EXPLAIN when > costs are enabled, it should not cause any performance regression in normal > execution. However, reviewers should be especially careful when verifying > test outputs, as this change could affect plan details in regression tests. > > Any thoughts?
> + ExplainIndentText(es); > + appendStringInfo(es->str, "Cache Estimated Entries: %d\n", ((Memoize *) > plan)->est_entries); > + ExplainIndentText(es); > + appendStringInfo(es->str, "Cache Estimated NDistinct: %0.0f\n", ((Memoize > *) plan)->ndistinct); est_entries is a uint32, so %u is the correct format character for that type. I don't think you need to prefix all these properties with "Cache" just because the other two properties have that prefix. I also don't think the names you've chosen really reflect the meaning. How about something like: "Estimated Distinct Lookup Keys: 721 Estimated Capacity: 655", in that order. I think maybe having that as one line for format=text is better than 2 lines. The EXPLAIN output is already often taking up more lines in v18 than in v17, would be good to not make that even worse unnecessarily. I see the existing code there could use ExplainPropertyText rather than have a special case for text and non-text formats. That's likely my fault. If we're touching this code, then we should probably tidy that up. Do you want to create a precursor fixup patch for that? + double ndistinct; /* Estimated number of distinct memoization keys, + * used for cache size evaluation. Kept for EXPLAIN */ Maybe this field in MemoizePath needs a better name. How about "est_unique_keys"? and also do the rename in struct Memoize. I'm also slightly concerned about making struct Memoize bigger. I had issues with a performance regression [1] for 908a96861 when increasing the WindowAgg struct size last year and the only way I found to make it go away was to shuffle the fields around so that the struct size didn't increase. I think we'll need to see a benchmark of a query that hits Memoize quite hard with a small cache size to see if the performance decreases as a result of adding the ndistinct field. It's unfortunate that we'll not have the luxury of squeezing this double into padding if we do see a slowdown. David [1] https://postgr.es/m/flat/CAHoyFK9n-QCXKTUWT_xxtXninSMEv%2BgbJN66-y6prM3f4WkEHw%40mail.gmail.com