Hello,
While doing an experiment, we found what I think are a few existing problems
in how the parallel safety of functions is handled. Could I hear your opinions
on what we should do? I'd be willing to create and submit a patch to fix them.
The experiment is to add a parallel safety check in FunctionCallInvoke() and
run the regression test with force_parallel_mode=regress. The added check
errors out with ereport(ERROR) when the about-to-be-called function is parallel
unsafe and the process is currently in parallel mode. Six test cases failed
because the following functions, all marked parallel unsafe, were called:
dsnowball_init
balkifnull
int44out
text_w_default_out
widget_out
The first function is created in src/backend/snowball/snowball_create.sql for
full text search. The remaining functions are created during the regression
test run.
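For reference, the kind of check I mean can be modeled roughly as below. This
is a standalone sketch, not the actual patch: ToyFuncInfo, toy_in_parallel_mode,
toy_last_error and toy_function_call_invoke() are made-up stand-ins for the
function lookup info, IsInParallelMode(), ereport(ERROR) and
FunctionCallInvoke() respectively, so that the snippet compiles without the
PostgreSQL headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Parallel-safety markers, mirroring the values stored in pg_proc.proparallel. */
#define PROPARALLEL_SAFE        's'
#define PROPARALLEL_RESTRICTED  'r'
#define PROPARALLEL_UNSAFE      'u'

/* Hypothetical stand-in for the per-function lookup info. */
typedef struct ToyFuncInfo
{
    const char *name;           /* function name, used for error reporting */
    char        proparallel;    /* one of the PROPARALLEL_* markers */
    int       (*fn) (int arg);  /* the function to call */
} ToyFuncInfo;

/* Stand-in for IsInParallelMode(); a plain flag in this model. */
static bool toy_in_parallel_mode = false;

/* Name of the last rejected function; stands in for ereport(ERROR). */
static const char *toy_last_error = NULL;

/* Trivial callee used to exercise the check. */
static int
toy_add_one(int arg)
{
    return arg + 1;
}

/*
 * Model of the invoke path with the added check: refuse to call a
 * parallel-unsafe function while in parallel mode.  Returns true and
 * stores the result on success, false if the call was rejected.
 */
static bool
toy_function_call_invoke(const ToyFuncInfo *finfo, int arg, int *result)
{
    assert(finfo != NULL && result != NULL);

    if (toy_in_parallel_mode && finfo->proparallel == PROPARALLEL_UNSAFE)
    {
        toy_last_error = finfo->name;
        return false;
    }

    *result = finfo->fn(arg);
    return true;
}
```

In this model, only the combination "parallel mode" plus "marked unsafe" is
rejected; safe and restricted functions are called as before, and so is
everything outside parallel mode.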
The relevant issues follow.
(1)
Judging from their implementations, all of the above functions are actually
parallel safe. It seems that their CREATE FUNCTION statements are just missing
the PARALLEL SAFE specification, so I think I'll add it.
dsnowball_lexize() may also be parallel safe.
(2)
I'm afraid the above failures reveal that PostgreSQL overlooks parallel safety
checks in some places. Specifically, we noticed the following:
* User-defined aggregate
CREATE AGGREGATE allows specifying the parallel safety of the aggregate itself,
and the planner checks it, but the support functions of the aggregate are not
checked. OTOH, the documentation clearly says:
https://www.postgresql.org/docs/devel/xaggr.html
"Worth noting also is that for an aggregate to be executed in parallel, the
aggregate itself must be marked PARALLEL SAFE. The parallel-safety markings on
its support functions are not consulted."
https://www.postgresql.org/docs/devel/sql-createaggregate.html
"An aggregate will not be considered for parallelization if it is marked
PARALLEL UNSAFE (which is the default!) or PARALLEL RESTRICTED. Note that the
parallel-safety markings of the aggregate's support functions are not consulted
by the planner, only the marking of the aggregate itself."
Can we check the parallel safety of aggregate support functions during
statement execution and error out? Is there any reason not to do so?
* User-defined data type
The input, output, send, receive, and other functions of a UDT are not checked
for parallel safety. Is there any good reason not to check them other than the
concern about performance?
* Functions for full text search
Should CREATE TEXT SEARCH TEMPLATE ensure that the functions are parallel safe?
(Those functions could be changed to parallel unsafe later with ALTER
FUNCTION, though.)
(3) Built-in UDFs are not checked for parallel safety
The functions defined in fmgr_builtins[], which are derived from pg_proc.dat,
are not checked. Most of them are marked parallel safe, but some are parallel
unsafe or restricted.
Besides, changing their parallel safety with ALTER FUNCTION PARALLEL does not
affect the selection of the query plan. This is because fmgr_builtins[] does
not have a member for parallel safety.
Should we add a member for parallel safety in fmgr_builtins[], and disallow
ALTER FUNCTION to change the parallel safety of builtin UDFs?
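To make the question concrete, here is a rough standalone sketch of what such
a member could look like. It is only an illustration under my own assumptions:
ToyFmgrBuiltin is loosely modeled on the entries of fmgr_builtins[], the
proparallel member and the helper functions are hypothetical, the OIDs and
function names are made up, and the linear search is for brevity only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;       /* stand-in for PostgreSQL's Oid type */

/* Toy builtin-table entry; proparallel is the hypothetical new member. */
typedef struct ToyFmgrBuiltin
{
    Oid         foid;           /* OID of the function */
    short       nargs;          /* number of arguments */
    bool        strict;         /* T if the function is strict */
    bool        retset;         /* T if the function returns a set */
    const char *funcName;       /* C name of the function */
    char        proparallel;    /* 's' = safe, 'r' = restricted, 'u' = unsafe */
} ToyFmgrBuiltin;

/* Made-up entries; in reality these would be generated from pg_proc.dat. */
static const ToyFmgrBuiltin toy_builtins[] = {
    {1001, 1, true, false, "toy_safe_fn", 's'},
    {1002, 0, false, false, "toy_restricted_fn", 'r'},
    {1003, 2, true, false, "toy_unsafe_fn", 'u'},
};

/* Find a builtin entry by OID, or return NULL if it is not a builtin. */
static const ToyFmgrBuiltin *
toy_builtin_lookup(Oid foid)
{
    size_t      n = sizeof(toy_builtins) / sizeof(toy_builtins[0]);

    for (size_t i = 0; i < n; i++)
    {
        if (toy_builtins[i].foid == foid)
            return &toy_builtins[i];
    }
    return NULL;
}

/* Return the parallel-safety marker of a builtin, or '\0' if not a builtin. */
static char
toy_builtin_parallel(Oid foid)
{
    const ToyFmgrBuiltin *b = toy_builtin_lookup(foid);

    assert(b == NULL || b->proparallel == 's' ||
           b->proparallel == 'r' || b->proparallel == 'u');
    return b ? b->proparallel : '\0';
}

/* ALTER FUNCTION ... PARALLEL would be refused for builtin functions. */
static bool
toy_alter_parallel_allowed(Oid foid)
{
    return toy_builtin_lookup(foid) == NULL;
}
```

With the marker stored in the table, a check at call time would not need a
catalog lookup for builtin functions, and refusing ALTER FUNCTION for them
would keep the table and pg_proc from disagreeing.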
Regards
Takayuki Tsunakawa