Hi, `compute_distinct_stats()` is used for data types that have an equality operator but no ordering (so we can't use the sort-based path). It keeps a `track[]` array of candidate MCVs and, for each sampled row, searches that array for a match. With high `statistics_target`, that becomes O(n) per sample and can dominate ANALYZE time.
This patch keeps the existing behavior but adds a fast lookup path: - When `attstattarget >= 100`, and the type's default hash support matches the equality operator, build a small `simplehash` table mapping a tracked Datum to its current `track[]` slot (average O(1) match lookup). - Otherwise, fall back to the existing linear scan. While here, the singleton-eviction logic changes from repeatedly shifting the count=1 region to a simple round-robin cursor over that region. This keeps replacement O(1) and (when hashing is enabled) avoids having to update many hash->index mappings. Performance I used a small harness focusing on this code path (xid and aclitem, with match-heavy / unique-heavy / Zipf-ish distributions). Setup: - MacBook Pro (Apple M4 Max, 36GB RAM), macOS 26.1 (Darwin 25.1.0) - Reported numbers are median of 5 runs - For the "after" numbers below, I set the hash threshold to 0 to show the potential benefit; the patch as attached enables hashing only for `attstattarget >= 100`. Results (ms; before -> after): obj | statistics_target | before_ms | after_ms | speedup_x ----+------------------+----------+---------+--------- bench_aclitem.x_match | 25 | 174.154 | 182.346 | 0.96 bench_aclitem.x_match | 50 | 55.708 | 55.693 | 1.00 bench_aclitem.x_match | 100 | 74.311 | 63.978 | 1.16 bench_aclitem.x_match | 200 | 143.621 | 86.790 | 1.65 bench_aclitem.x_match | 500 | 462.616 | 125.083 | 3.70 bench_aclitem.x_match | 1000 | 1590.918 | 202.849 | 7.84 bench_aclitem.x_match | 2000 | 5857.253 | 406.840 | 14.40 bench_aclitem.x_match | 5000 | 30844.177 | 1678.826 | 18.37 bench_aclitem.x_match | 10000 | 73134.141 | 9207.071 | 7.94 bench_aclitem.x_unique | 25 | 18.214 | 17.962 | 1.01 bench_aclitem.x_unique | 50 | 39.305 | 37.045 | 1.06 bench_aclitem.x_unique | 100 | 76.555 | 64.800 | 1.18 bench_aclitem.x_unique | 200 | 151.416 | 90.619 | 1.67 bench_aclitem.x_unique | 500 | 497.243 | 134.351 | 3.70 bench_aclitem.x_unique | 1000 | 1788.755 | 209.299 | 8.55 bench_aclitem.x_unique | 2000 | 7249.638 | 332.865 | 21.78 bench_aclitem.x_unique | 5000 | 50785.046 | 604.944 | 83.95 bench_aclitem.x_unique | 10000 | 195506.022 | 792.658 | 246.65 bench_aclitem.x_zipf | 25 | 17.451 | 19.770 | 0.88 bench_aclitem.x_zipf | 50 | 37.131 | 35.560 | 1.04 bench_aclitem.x_zipf | 100 | 76.053 | 67.494 | 1.13 bench_aclitem.x_zipf | 200 | 145.435 | 100.484 | 1.45 bench_aclitem.x_zipf | 500 | 429.574 | 131.647 | 3.26 bench_aclitem.x_zipf | 1000 | 1313.187 | 223.918 | 5.86 bench_aclitem.x_zipf | 2000 | 4117.637 | 445.521 | 9.24 bench_aclitem.x_zipf | 5000 | 14863.325 | 1189.886 | 12.49 bench_aclitem.x_zipf | 10000 | 31706.434 | 2269.572 | 13.97 bench_xid.x_match | 25 | 22.136 | 56.767 | 0.39 bench_xid.x_match | 50 | 42.182 | 40.947 | 1.03 bench_xid.x_match | 100 | 81.089 | 67.848 | 1.20 bench_xid.x_match | 200 | 118.315 | 80.993 | 1.46 bench_xid.x_match | 500 | 403.674 | 120.421 | 3.35 bench_xid.x_match | 1000 | 1246.068 | 171.385 | 7.27 bench_xid.x_match | 2000 | 4374.122 | 406.823 | 10.75 bench_xid.x_match | 5000 | 21961.532 | 1903.715 | 11.54 bench_xid.x_match | 10000 | 55453.808 | 11035.402 | 5.03 bench_xid.x_unique | 25 | 21.062 | 20.011 | 1.05 bench_xid.x_unique | 50 | 40.473 | 41.414 | 0.98 bench_xid.x_unique | 100 | 79.711 | 67.382 | 1.18 bench_xid.x_unique | 200 | 125.255 | 72.853 | 1.72 bench_xid.x_unique | 500 | 412.332 | 96.030 | 4.29 bench_xid.x_unique | 1000 | 1373.882 | 144.837 | 9.49 bench_xid.x_unique | 2000 | 5084.898 | 277.216 | 18.34 bench_xid.x_unique | 5000 | 31412.454 | 574.693 | 54.66 bench_xid.x_unique | 10000 | 126639.495 | 804.947 | 157.33 bench_xid.x_zipf | 25 | 21.915 | 20.758 | 1.06 bench_xid.x_zipf | 50 | 41.458 | 39.498 | 1.05 bench_xid.x_zipf | 100 | 78.128 | 70.466 | 1.11 bench_xid.x_zipf | 200 | 118.361 | 79.816 | 1.48 bench_xid.x_zipf | 500 | 351.440 | 114.009 | 3.08 bench_xid.x_zipf | 1000 | 1049.475 | 173.981 | 6.03 bench_xid.x_zipf | 2000 | 3162.409 | 404.565 | 7.82 bench_xid.x_zipf | 5000 | 11170.524 | 1346.067 | 8.30 bench_xid.x_zipf | 10000 | 23928.836 | 2549.381 | 9.39 The bench harness (bench/) is attached for reproduction. Patch attached. -- Best regards, Chengpeng Yan
v1-0001-ANALYZE-speed-up-MCV-tracking-for-equality-only-t.patch
Description: v1-0001-ANALYZE-speed-up-MCV-tracking-for-equality-only-t.patch
<<attachment: bench.zip>>
