> 1) It's better to always include the whole patch series - including the
> parts that have not changed. Otherwise people have to scavenge the
> thread and search for all the pieces, which may be a source of issues.
> Also, it confuses the patch tester [1] which tries to apply patches from
> a single message, so it will fail for this one.

Patches 3 and 4 do not rely on patches 1 and 2 at the code level, but applying patches 3 and 4 directly will still fail, because they were written on top of 1 and 2. I can generate a new single patch if you need one.
> 2) I suggest you try to describe the goal of these patches, using some
> example queries, explain output etc. Right now the reviewers have to
> reverse engineer the patches and deduce what the intention was, which
> may be causing unnecessary confusion etc. If this was my patch, I'd try
> to create a couple examples (CREATE TABLE + SELECT + EXPLAIN) showing
> how the patch changes the query plan, showing speedup etc.

I have added some example queries to the regression tests, covering "unique", "union", "group by", and "group by grouping sets" (a sketch of the "unique" and "union" cases is shown after the timing results below). Here are my timing tests; they are not in the regression tests:

```sql
begin;
create table gtest(id integer, txt text);
insert into gtest select t1.id,'txt'||t1.id from (select generate_series(1,1000*1000) id) t1,(select generate_series(1,10) id) t2;
analyze gtest;
commit;
set jit = off;
\timing on
```

normal aggregate times

```
set enable_batch_hashagg = off;

explain (costs off,analyze,verbose) select sum(id),txt from gtest group by txt;
                                                  QUERY PLAN
-------------------------------------------------------------------------------------------------------------
 Finalize GroupAggregate (actual time=6469.279..8947.024 rows=1000000 loops=1)
   Output: sum(id), txt
   Group Key: gtest.txt
   ->  Gather Merge (actual time=6469.245..8165.930 rows=1000058 loops=1)
         Output: txt, (PARTIAL sum(id))
         Workers Planned: 2
         Workers Launched: 2
         ->  Sort (actual time=6356.471..7133.832 rows=333353 loops=3)
               Output: txt, (PARTIAL sum(id))
               Sort Key: gtest.txt
               Sort Method: external merge  Disk: 11608kB
               Worker 0:  actual time=6447.665..7349.431 rows=317512 loops=1
                 Sort Method: external merge  Disk: 10576kB
               Worker 1:  actual time=6302.882..7061.157 rows=333301 loops=1
                 Sort Method: external merge  Disk: 11112kB
               ->  Partial HashAggregate (actual time=2591.487..4430.437 rows=333353 loops=3)
                     Output: txt, PARTIAL sum(id)
                     Group Key: gtest.txt
                     Batches: 17  Memory Usage: 4241kB  Disk Usage: 113152kB
                     Worker 0:  actual time=2584.345..4486.407 rows=317512 loops=1
                       Batches: 17  Memory Usage: 4241kB  Disk Usage: 101392kB
                     Worker 1:  actual time=2584.369..4393.244 rows=333301 loops=1
                       Batches: 17  Memory Usage: 4241kB  Disk Usage: 112832kB
                     ->  Parallel Seq Scan on public.gtest (actual time=0.691..603.990 rows=3333333 loops=3)
                           Output: id, txt
                           Worker 0:  actual time=0.104..607.146 rows=3174970 loops=1
                           Worker 1:  actual time=0.100..603.951 rows=3332785 loops=1
 Planning Time: 0.226 ms
 Execution Time: 9021.058 ms
(29 rows)

Time: 9022.251 ms (00:09.022)

set enable_batch_hashagg = on;

explain (costs off,analyze,verbose) select sum(id),txt from gtest group by txt;
                                            QUERY PLAN
-------------------------------------------------------------------------------------------------
 Gather (actual time=3116.666..5740.826 rows=1000000 loops=1)
   Output: (sum(id)), txt
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel BatchHashAggregate (actual time=3103.181..5464.948 rows=333333 loops=3)
         Output: sum(id), txt
         Group Key: gtest.txt
         Worker 0:  actual time=3094.550..5486.992 rows=326082 loops=1
         Worker 1:  actual time=3099.562..5480.111 rows=324729 loops=1
         ->  Parallel Seq Scan on public.gtest (actual time=0.791..656.601 rows=3333333 loops=3)
               Output: id, txt
               Worker 0:  actual time=0.080..646.053 rows=3057680 loops=1
               Worker 1:  actual time=0.070..662.754 rows=3034370 loops=1
 Planning Time: 0.243 ms
 Execution Time: 5788.981 ms
(15 rows)

Time: 5790.143 ms (00:05.790)
```

grouping sets times

```
set enable_batch_hashagg = off;

explain (costs off,analyze,verbose) select sum(id),txt from gtest group by grouping sets(id,txt,());
                                         QUERY PLAN
------------------------------------------------------------------------------------------
 GroupAggregate (actual time=9454.707..38921.885 rows=2000001 loops=1)
   Output: sum(id), txt, id
   Group Key: gtest.id
   Group Key: ()
   Sort Key: gtest.txt
     Group Key: gtest.txt
   ->  Sort (actual time=9454.679..11804.071 rows=10000000 loops=1)
         Output: txt, id
         Sort Key: gtest.id
         Sort Method: external merge  Disk: 254056kB
         ->  Seq Scan on public.gtest (actual time=2.250..2419.031 rows=10000000 loops=1)
               Output: txt, id
 Planning Time: 0.230 ms
 Execution Time: 39203.883 ms
(14 rows)

Time: 39205.339 ms (00:39.205)

set enable_batch_hashagg = on;

explain (costs off,analyze,verbose) select sum(id),txt from gtest group by grouping sets(id,txt,());
                                            QUERY PLAN
-------------------------------------------------------------------------------------------------
 Gather (actual time=5931.776..14353.957 rows=2000001 loops=1)
   Output: (sum(id)), txt, id
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel BatchHashAggregate (actual time=5920.963..13897.852 rows=666667 loops=3)
         Output: sum(id), txt, id
         Group Key: gtest.id
         Group Key: ()
         Group Key: gtest.txt
         Worker 0:  actual time=5916.370..14062.461 rows=513810 loops=1
         Worker 1:  actual time=5916.037..13932.847 rows=775901 loops=1
         ->  Parallel Seq Scan on public.gtest (actual time=0.399..688.273 rows=3333333 loops=3)
               Output: id, txt
               Worker 0:  actual time=0.052..690.955 rows=3349990 loops=1
               Worker 1:  actual time=0.050..691.595 rows=3297070 loops=1
 Planning Time: 0.157 ms
 Execution Time: 14598.416 ms
(17 rows)

Time: 14599.437 ms (00:14.599)
```
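For completeness, here is a minimal sketch of the kind of "unique" and "union" queries covered by the regression tests, run against the same gtest table; the exact statements in the regression tests may differ.

```sql
-- Minimal sketch of the "unique" and "union" cases mentioned above;
-- the exact queries used in the regression tests may differ.
set enable_batch_hashagg = on;

-- DISTINCT goes through the same grouping machinery, so with the patch
-- it should also be able to use a Parallel BatchHashAggregate node.
explain (costs off, verbose)
select distinct txt from gtest;

-- UNION (without ALL) must de-duplicate its result, which is again a
-- grouping step.
explain (costs off, verbose)
select id from gtest union select id from gtest;
```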