Have you benchmarked this change beyond the results in the first message of this thread?
While reviewing the patch more closely, I noticed that compute_distinct_stats() is only used for types that have equality operators (=, !=) but no ordering operator (<). In practice, most common scalar types go through compute_scalar_stats() instead.
That makes me wonder how often this optimization would actually trigger in real workloads. Since compute_scalar_stats() is the more common path, there's a chance that the hash-table-based improvement in compute_distinct_stats() will not provide a noticeable overall benefit.
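To make the dispatch I am describing concrete, here is a simplified sketch of how I understand the choice of routine in std_typanalyze(). The enum and the function choose_stats_routine() are made up for illustration and do not exist in the tree; as far as I can tell, the real code tests the validity of the equality and less-than operator OIDs rather than taking booleans.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-ins for illustration only; these are not
 * PostgreSQL's actual types or function names.
 */
typedef enum
{
    STATS_SCALAR,    /* compute_scalar_stats(): needs both = and < */
    STATS_DISTINCT,  /* compute_distinct_stats(): needs = only */
    STATS_TRIVIAL    /* compute_trivial_stats(): no usable operators */
} StatsRoutine;

/*
 * Pick the statistics routine from the operators the type provides.
 * Types with an ordering operator never reach the distinct-stats path.
 */
StatsRoutine
choose_stats_routine(bool has_eq_op, bool has_lt_op)
{
    if (has_eq_op && has_lt_op)
        return STATS_SCALAR;
    if (has_eq_op)
        return STATS_DISTINCT;
    return STATS_TRIVIAL;
}
```

If this reading is right, only columns whose type lacks a < operator would see any effect from the patch.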
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/
