On Wed, Apr 11, 2018 at 2:34 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> That test is, therefore, wrong.  Otherwise, no non-builtin function
> could ever be marked parallel safe, for fear that the shlib it lives
> in might try to set up a custom variable at load time.

I don't follow that logic.  If the check is more stringent than it
needs to be for safety, then we can relax it.  If the check is less
stringent than it needs to be for safety, we need to tighten it.  If
it's exactly right, then it has to stay the way it is, even if that
prohibits some things we wish we could allow.

> I'm of the opinion that having such a test here at all is crummy design.
> It implies that set_config_option is in charge of knowing about every
> one of its callers and passing judgment on whether they represent
> parallel-safe usages, which is the exact opposite of modularity,
> even if set_config_option had enough information to make that judgment
> which it does not.

The point of SerializeGUCState and RestoreGUCState is to enforce an
invariant that all cooperating processes have the same idea about the
values of every GUC.  If you let the GUCs be changed, then isn't that
invariant is violated regardless of why you let it happen?  For

rhaas=# set pg_trgm.similarity_threshold = '002';
rhaas=# select current_setting('pg_trgm.similarity_threshold');
(1 row)

rhaas=# load 'pg_trgm';
WARNING:  2 is outside the valid range for parameter
"pg_trgm.similarity_threshold" (0 .. 1)
rhaas=# select current_setting('pg_trgm.similarity_threshold');
(1 row)

This shows that if you load a library that defines a custom variable
in process A but not process B, it's trivial to construct a query that
no longer returns the same answer in both backends, which isn't too
good, because the point of parallel query is to deliver the same
answer with parallel query that you would have gotten without it, or
at the very least, to make it look like the answer was generated by a
single process with a unified view of the world rather than multiple
uncoordinated processes.

I tried the following test with Jeff's example function and with
plperl.on_init='' and it does not produce a warning:

rhaas=# set force_parallel_mode = on;
rhaas=# select foobar('dsgsdg');
(1 row)

I haven't tracked it down, but I think what must be going on here is
that, when the function is called via a query, something triggers the
shared library to get loaded before we enter parallel mode.  Then
things work OK, because the worker will load the shared library before
restoring the leader's GUC state, and so the state is actually
guaranteed to match.  Here it will *probably* end up matching because
most likely every process that loads the module will interpret the
existing textual setting of the newly-defined GUC in the same way, but
it's a bit less ironclad, and if they decide to emit warnings (as in
my first example, above) then you'll get multiple copies of them.  So
I don't think just drilling a hole through the prohibition in
set_config_option() is a particularly appealing solution.  It would
amount to suppressing a warning that is basically legitimate.

I wonder if there's an easy way to force the libraries that we're
going to need to be loaded before we reach CreateParallelContext().

Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Reply via email to