On 08.09.2025 13:56, David Geier wrote:
To evaluate it, I ran benchmarks on JOB with three variants:
$ ./benchmark.sh master
$ ./benchmark.sh merge
$ ./benchmark.sh hash
I compared total planning time across all 113 queries.
Was this running with optimizations? How did you extract the planning time?
I save all query plans using EXPLAIN SUMMARY, then go through all the
plans, read the 'Planning Time' for each, and sum them up.
I would have expected the delta between the "merge" and "hash" variant
to be bigger, especially for default_statistics_target=10000. My small
test also showed that. Any idea why this is not showing in your results?
So would I. With default_statistics_target = 10000 and the selectivity
in the JOB queries being close to zero, the difference should be
noticeable. I can only explain the previous results by cache-related
effects on my machine.
I reran the benchmark on a clean cluster and collected the top slowest
JOB queries — now the effect is clearly visible.
Merge (sum of all JOB queries)
==================
default_statistics_target | Planner Speedup (×) | Planner Before (ms) |
Planner After (ms)
--------------------------------------------------------------------------------
100 | *1.00* | 1888.105
| 1879.431
1000 | *1.14* | 2282.239
| 2009.114
2500 | *2.10* | 5595.030
| 2668.530
5000 | *5.56* | 18544.933
| 3333.252
7500 | *9.17* | 37390.956
| 4076.390
10000 | *16.10* | 69319.479
| 4306.417
HashMap (sum of all JOB queries)
==================
default_statistics_target | Planner Speedup (×) | Planner Before (ms) |
Planner After (ms)
--------------------------------------------------------------------------------
100 | *1.03* | 1888.105 |
1828.088
1000 | *1.18* | 2282.239 |
1939.884
2500 | *2.64* | 5595.030 |
2117.872
5000 | *7.80* | 18544.933 |
2377.206
7500 | *13.80* | 37390.956 |
2709.973
10000 | *23.32* | 69319.479 |
2973.073
Top 10 slowest JOB queries (default_statistics_target = 10000)
Query | master (ms) | merge (ms) | Hash (ms)
------+-------------+------------+-----------
29c | 1904.586 | 144.135 | 100.473
29b | 1881.392 | 117.891 | 89.028
29a | 1868.805 | 112.242 | 83.913
31c | 1867.234 | 76.498 | 56.140
30c | 1646.630 | 88.494 | 62.549
30b | 1608.820 | 84.821 | 64.603
31a | 1573.964 | 75.978 | 56.140
28a | 1457.738 | 95.939 | 77.309
28b | 1455.052 | 99.383 | 73.065
30a | 1416.699 | 91.057 | 62.549
BTW, the hashmap from your patch could also be applied to
eqjoinsel_semi() function.
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com