Hello

I verified that the patch is generally faster in my benchmarks, with
one exception: anti joins with heavy duplication end up being
significantly slower. For example:

create table ao (a int not null);
create table ai (k int not null);
insert into ao select g from generate_series(1,100000) g;
insert into ai select g % 50 from generate_series(1,2000000) g;
analyze ao;
analyze ai;
\timing on
explain (analyze, costs off, timing off, summary off)
select count(*) from ao where a not in (select distinct k from ai);

This seems related to parallelization: in these scenarios the patched
version chooses a serial plan, whereas master uses a parallelized
deduplication, and the patched version ends up being 2-4x slower.
If I force the patched version to use parallel workers, it ends up
being faster even in these cases.
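
Something along these lines should push the planner toward a parallel
plan for the query above. This is only a sketch using the standard
planner settings; the values are illustrative assumptions, and other
ways of forcing parallelism may behave differently:

-- assumed values, chosen only to make parallel plans look cheap
set max_parallel_workers_per_gather = 4;
set parallel_setup_cost = 0;
set parallel_tuple_cost = 0;
set min_parallel_table_scan_size = 0;
explain (analyze, costs off, timing off, summary off)
select count(*) from ao where a not in (select distinct k from ai);

Comparing this plan with the default one on both master and the patched
build should show whether the slowdown comes from the plan choice
rather than from the new execution path itself.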

