On 24-11-2020 01:44, Greg Nancarrow wrote:
On Tue, Nov 24, 2020 at 2:34 AM Luc Vlaming <l...@swarm64.com> wrote:
Hi,
For this problem there is a patch I created, which is registered under
https://commitfest.postgresql.org/30/2787/ that should fix this without
any workarounds. Maybe someone can take a look at it?
I tried your patch with the latest PG source code (24/11), but
unfortunately a non-parallel plan was still produced in this case.
test=# explain
select count(*)
from (select
n1
from drop_me
union all
values(1)) ua;
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Aggregate  (cost=1889383.54..1889383.55 rows=1 width=8)
   ->  Append  (cost=0.00..1362834.03 rows=42123961 width=32)
         ->  Seq Scan on drop_me  (cost=0.00..730974.60 rows=42123960 width=32)
         ->  Subquery Scan on "*SELECT* 2"  (cost=0.00..0.02 rows=1 width=32)
               ->  Result  (cost=0.00..0.01 rows=1 width=4)
(5 rows)
That's not to say your patch doesn't have merit; it just may not be a
fix for this particular case.
As before, if the SQL is tweaked to align the types for the UNION, you
get a parallel plan:
test=# explain
select count(*)
from (select
n1
from drop_me
union all
values(1::numeric)) ua;
                                             QUERY PLAN
----------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=821152.71..821152.72 rows=1 width=8)
   ->  Gather  (cost=821152.50..821152.71 rows=2 width=8)
         Workers Planned: 2
         ->  Partial Aggregate  (cost=820152.50..820152.51 rows=1 width=8)
               ->  Parallel Append  (cost=0.00..747235.71 rows=29166714 width=0)
                     ->  Result  (cost=0.00..0.01 rows=1 width=0)
                     ->  Parallel Seq Scan on drop_me  (cost=0.00..601402.13 rows=29166713 width=0)
(7 rows)
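For anyone wanting to check the type mismatch between the two UNION ALL
branches, here is a minimal sketch. It assumes drop_me has a single
numeric column n1; the table definition is not shown in this message, so
the CREATE TABLE and the row count below are hypothetical. The assumption
is consistent with the first plan, where the Result node has width=4 but
is wrapped in a Subquery Scan with width=32.

```sql
-- Hypothetical setup: the real definition of drop_me is not given in this thread.
-- A numeric column is assumed because the ::numeric cast is what aligns the types.
create table drop_me (n1 numeric);
insert into drop_me select g from generate_series(1, 1000000) g;  -- arbitrary row count
analyze drop_me;

-- Compare the column types of the two UNION ALL branches.
select pg_typeof(n1) from drop_me limit 1;                 -- numeric, under the assumption above
select pg_typeof(column1) from (values (1)) v;             -- integer
select pg_typeof(column1) from (values (1::numeric)) v;    -- numeric
```

With a table this small the planner may not pick a parallel plan in
either case, so the row count may need to be raised before rerunning the
two EXPLAIN statements above.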
Regards,
Greg Nancarrow
Fujitsu Australia
Hi,
You're completely right, sorry for my error. I was too quick to assume
my patch would work for this specific case too; I should have tested
that before replying. The case looked very similar, but it turns out not
to work because the upper rel is not considered parallel.
I would like to extend my patch to support this, or create a second
patch. This would, however, be significantly more involved, because it
would require that we (always?) consider two paths whenever we process a
subquery: the best parallel plan and the best serial plan. Before I
embark on such a journey, I would like some input on whether this would
be a very bad idea. Thoughts?
Regards,
Luc
Swarm64