so 21. 3. 2020 v 21:29 odesílatel Pavel Stehule <>

> so 21. 3. 2020 v 19:24 odesílatel Tom Lane <> napsal:
>> Pavel Stehule <> writes:
>> > So the patch has a problem with constant casting - unfortunately the
>> mix of
>> > double precision variables and numeric constants is pretty often in
>> > Postgres.
>> Yeah.  I believe the cause of that is that the patch thinks it can skip
>> passing an inline-function-free simple expression through the planner.
>> That's flat out wrong.  Quite aside from failing to perform
>> constant-folding (which is presumably the cause of the slowdown you
>> spotted), that means that we miss performing such non-optional
>> transformations as rearranging named-function-argument notation into
>> positional order.  I didn't bother to test that but I'm sure it can be
>> shown to lead to crashes.
>> Now that I've looked at the patch I don't like it one bit from a
>> structural standpoint either.  It's basically trying to make an end
>> run around the plancache, which is not going to be maintainable even
>> if it correctly accounted for everything the plancache does today.
>> Which it doesn't.  Two big problems are:
>> * It doesn't account for the possibility of search_path changes
>> affecting the interpretation of an expression.
>> * It assumes that the *only* things that a simple plan could get
>> invalidated for are functions that were inlined.  This isn't the
>> case --- a counterexample is that removal of no-op CoerceToDomain
>> nodes requires the plan to be invalidated if the domain's constraints
>> change.  And there are likely to be more such issues in future.
>> So while there's clearly something worth pursuing here, I do not like
>> anything about the way it was done.  I think that the right way to
>> think about this problem is "how can the plan cache provide a fast
>> path for checking validity of simple-expression plans?".  And when you
>> think about it that way, there's a pretty obvious answer: if the plan
>> involves no table references, there's not going to be any locks that
>> have to be taken before we can check the is_valid flag.  So we can
>> have a fast path that skips AcquirePlannerLocks and
>> AcquireExecutorLocks, which are a big part of the problem, and we can
>> also bypass some of the other random checks that GetCachedPlan has to
>> make, like whether RLS affects the plan.
>> Another chunk of the issue is the constant acquisition and release of
>> reference counts on the plan.  We can't really skip that (I suspect
>> there are additional bugs in Amit's patch arising from trying to do so).
>> However, plpgsql already has mechanisms for paying simple-expression
>> setup costs once per transaction rather than once per expression use.
>> So we can set up a simple-expression ResourceOwner managed much like
>> the simple-expression EState, and have it hold a refcount on the
>> CachedPlan for each simple expression, and pay that overhead just once
>> per transaction.
>> So I worked on those ideas for awhile, and came up with the attached
>> patchset:
>> 0001 adds some regression tests in this area (Amit's patch fails the
>> tests concerning search_path changes).
>> 0002 does what's suggested above.  I also did a little bit of marginal
>> micro-tuning in exec_eval_simple_expr() itself.
>> 0003 improves the biggest remaining cost of validity rechecking,
>> which is verifying that the search_path is the same as it was when
>> the plan was cached.
>> I haven't done any serious performance testing on this, but it gives
>> circa 2X speedup on Pavel's original example, which is at least
>> fairly close to the results that Amit's patch got there.  And it
>> makes this last batch of test cases faster not slower, too.
>> With this patch, perf shows the hotspots on Pavel's original example
>> as being
>> +   19.24%    19.17%         46470  postmaster
>>          [.] exec_eval_expr
>> +   15.19%    15.15%         36720  postmaster
>>          [.] plpgsql_param_eval_var
>> +   14.98%    14.94%         36213  postmaster       postgres
>>          [.] ExecInterpExpr
>> +    6.32%     6.30%         15262  postmaster
>>          [.] exec_stmt
>> +    6.08%     6.06%         14681  postmaster
>>          [.] exec_assign_value
>> Maybe there's more that could be done to knock fat out of
>> exec_eval_expr and/or plpgsql_param_eval_var, but at least
>> the plan cache isn't the bottleneck anymore.
> I tested Tom's patches, and I can confirm these results.
> It doesn't break tests (and all tests plpgsql_check tests passed without
> problems).
> The high overhead of ExecInterpExpr is related to prepare fcinfo, and
> checking nulls arguments because all functions are strict
> plpgsql_param_eval_var, looks like expensive is var = (PLpgSQL_var *)
> estate->datums[dno] and *op->resvalue = var->value;

I rechecked Tom's patch, and all tests passed, and there is stable positive
performance impact about 30% in tested pi estimation example.

After this patch, the code is only 3x times slower than in Lua (originally
it was 5x) and 1/3 slower than in Python (but Python calculates with higher

I think so this speed is maximum what is possible (now) - after patching
the slower execution is assigned to nullable types and related operations.
Probably it can be reduced too. The variables can be marked as NOT NULL,
and if all variables are NOT NULL, then we don't need to repeat check of
null arguments of strict functions.

I'll mark this patch as ready for commiters.

Thank you


> It looks great.
> Pavel
>>                         regards, tom lane

Reply via email to