On Fri, Jan 2, 2026 at 8:45 PM David Rowley <[email protected]> wrote:
> Do you have an example case of this happening? Ideally, the code that
> should disfavour Memoize for this case is estimate_num_groups() as
> called in cost_memoize_rescan() by returning that there's 1 group per
> input row. I guess that's not happening for this case? Why?

I have seen the issue come up a few times when the unique constraint spans multiple columns, the join is on only one of those columns, and a constant filter is applied to the other column (e.g. https://www.postgresql.org/message-id/CAAiQw3yBPrCw6ZLeTwVS4QhKDWgJkmmp9LnGPdodxeQmn=k...@mail.gmail.com).

I think this leaves room for sampling error: the sample can contain more duplicates of the join key than expected, which lowers n_distinct, or more rows with the filtered constant value, which raises its estimated frequency (in the case I linked, the constant's frequency was stored in the column's MCV list). Either way, the result is a nonzero expected hit ratio, because the estimated cardinality (est. calls) ends up greater than ndistinct.

However, I am not certain this is what is happening. There probably would not need to be much of an asymmetry to trigger Memoize, given the high cost the planner assigns to the extra index scans, but it still seems odd that the stats could be off by enough from sampling alone. Maybe there is a statistics bug at play? I am not certain.

Jacob
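
To make the hit-ratio point above concrete, here is a rough Python sketch of the arithmetic as I understand it. It is a simplified model based on my reading of cost_memoize_rescan(), not the planner's actual code: the function name memoize_hit_ratio and its parameters are made up for illustration, the numbers are invented, and the real formula may differ between versions.

```python
def memoize_hit_ratio(calls, ndistinct, est_cache_entries):
    """Simplified model of the expected Memoize cache hit ratio.

    calls             -- estimated number of rescans (outer rows)
    ndistinct         -- estimated distinct values of the cache key
    est_cache_entries -- estimated number of entries that fit in the cache
    All three are hypothetical inputs, not values read from a real plan.
    """
    if calls <= 0 or ndistinct <= 0 or est_cache_entries <= 0:
        return 0.0
    # Fraction of calls that repeat a key already seen ...
    repeat_fraction = (calls - ndistinct) / calls
    # ... scaled down when not every distinct key fits in the cache.
    fit_fraction = est_cache_entries / max(ndistinct, est_cache_entries)
    return max(repeat_fraction * fit_fraction, 0.0)


# One group per input row: no repeats are expected, so caching never pays off.
print(memoize_hit_ratio(calls=1000, ndistinct=1000, est_cache_entries=1000))

# n_distinct underestimated by 10%: the expected hit ratio becomes nonzero.
print(memoize_hit_ratio(calls=1000, ndistinct=900, est_cache_entries=1000))
```

In this model an accurate estimate of one group per input row gives a hit ratio of zero, while any underestimate of ndistinct relative to the call count gives a positive one, which is the asymmetry I was describing.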
