Re: [PATCH] ANALYZE: hash-accelerate MCV tracking for equality-only types

Ilia Evdokimov Wed, 21 Jan 2026 12:44:29 -0800

Hi Chengpeng,

On 21.01.2026 17:13, Chengpeng Yan wrote:

`compute_distinct_stats()` is used for data types that have an equality
operator but no ordering (so we can't use the sort-based path).  It
keeps a `track[]` array of candidate MCVs and, for each sampled row,
searches that array for a match.  With high `statistics_target`, that
becomes O(n) per sample and can dominate ANALYZE time.

Nice catch - this is indeed not first time we run into an O(N^2)bottleneck in MCV array and address it with hash-based lookup. Thanksfor working on this!


I have a couple of follow-up questions after a quick look at the patch:

1. The hash table is created but I do not see a corresponding destroy call.

2. In the original code, when a value was not found using thenested-loop search, the singleton (count = 1) region was shifted to makeroom for the new entry. After the patch, I no longer see this shiftinglogic. I might be missing something, but it looks like the non-hash pathnow follows the same replacement behavior as the hash-based one.

I’ll need a bit more time for a deeper review and some local benchmarks,but overall the approach looks interesting.


--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/

Re: [PATCH] ANALYZE: hash-accelerate MCV tracking for equality-only types

Reply via email to