Re: plan cache overhead on plpgsql expression

2020-03-27 Thread Tom Lane
I wrote:
> Amit Langote  writes:
>> One thing -- I don't get the division between
>> CachedPlanAllowsSimpleValidityCheck() and CachedPlanIsSimplyValid().
>> Maybe I am missing something, but could there not be just one
>> function, possibly using whether expr_simple_expr is set or not to
>> skip or do, resp., the checks that the former does?

> Well, we don't want to do the initial checks over again every time;
> we want the is-valid test to be as simple and fast as we can make it.
> I suppose we could have one function with a boolean flag saying "this is a
> recheck", but I don't find that idea to be any better than the way it is.

So after looking at the buildfarm results, I think you were on to
something.  The initial and recheck conditions actually have to be
a bit different, and the reason is that immediately after GetCachedPlan
has produced a plan, it's possible for plansource->is_valid to be false
even though the derived plan is marked valid.  (In the buildfarm, this
is happening because of CLOBBER_CACHE_ALWAYS or equivalent cache flushes;
in the real world it'd probably require sinval queue overflow to happen
while building the plan.)

What we want in this situation is to go ahead and use the derived plan,
and then rebuild next time; that's what the pre-existing code did and
it's really the only reasonable answer.  It might seem better to go
back and try to rebuild the plan right away, but that'd be an infinite
loop in a CLOBBER_CACHE_ALWAYS build.  Also, if we fail to use the
derived plan at all, that amounts to disabling the "simple expression"
optimization as a result of a chance sinval overflow.  That's bad from
a performance standpoint and it will also cause regression test output
changes (since, as you previously discovered, the simple-expression
path produces different CONTEXT messages for error cases --- maybe we
should change that, but I don't want to be forced into it).

The existing code structure can't support doing it like that, so we have
to refactor to make the initial check and the recheck be separate code.
Working on a patch for that now.

regards, tom lane




Re: plan cache overhead on plpgsql expression

2020-03-26 Thread Tom Lane
Andres Freund  writes:
> On 2020-03-26 14:37:59 -0400, Tom Lane wrote:
>> Testing that reminded me of the other regression test failure I'd seen
>> when I first tried to do it: select_parallel.sql shows a WARNING about
>> a plancache leak in a parallel worker process.  When I looked into the
>> reason for that, it turned out that some cowboy has split
>> XACT_EVENT_COMMIT into XACT_EVENT_COMMIT and
>> XACT_EVENT_PARALLEL_COMMIT (where the latter is used in parallel
>> workers) without bothering to fix the collateral damage to plpgsql.
>> So plpgsql_xact_cb isn't doing any cleanup in parallel workers, and
>> hasn't been for a couple of releases.
>> The bad effects of that are probably limited given that the worker
>> process will exit after committing, but I still think that that part
>> of this patch is a bug fix that needs to be back-patched.

> Ugh. Lucky that we don't register many things inside those resowners.

Yeah.  I spent some time trying to produce a failure this way, and
concluded that it's pretty hard because most of the relevant callbacks
will be run during ExprContext shutdown, which is done during plpgsql
function exit.  In a non-transaction-abort situation, the simple EState
shouldn't have any live ExprContexts left at commit.  I did find a case
where a memory context callback attached to the EState's query context
doesn't get run when expected ... but it still gets run later, when the
whole memory context tree is destroyed.  So I can't demonstrate any
user-visible misbehavior in the core code.  But it still seems like a
prudent idea to back-patch a fix, in case I missed something or there is
some extension that's pushing the boundaries further.  It's definitely
not very cool that we're leaving behind a dangling static pointer to an
EState that won't exist once TopTransactionMemoryContext is gone.

I'll back-patch relevant parts of those comments about DO block
management, too.

regards, tom lane




Re: plan cache overhead on plpgsql expression

2020-03-26 Thread Tom Lane
Andres Freund  writes:
> On 2020-03-26 14:37:59 -0400, Tom Lane wrote:
>> + * This function, together with CachedPlanIsSimplyValid, provides a fast 
>> path
>> + * for revalidating "simple" generic plans.  The core requirement to be 
>> simple
>> + * is that the plan must not require taking any locks, which translates to
>> + * not touching any tables; this happens to match up well with an important
>> + * use-case in PL/pgSQL.

> Hm - is there currently sufficient guarantee that we absorb sinval
> messages? Would still matter for types, functions, etc?

There are potentially issues of that sort throughout the backend, not
just here, since we don't have any locking on types or functions.
I don't think it's this patch's job to address that.  In practice
I think we've thought about it and concluded that the cost/benefit
of introducing such locks just isn't promising:

* Generally speaking you can't do anything very interesting to a type
anyway, at least not with supported DDL.  The worst-case situation that
could materialize AFAIK is possibly evaluating slightly-stale constraints
for a domain.  (The typcache does have sinval invalidation for those
constraints, but I don't recall offhand how much we guarantee about
how quickly we'll notice updates.)

* For functions, you might execute a somewhat stale version of the
function body.  The bad effects there are pretty limited since a function
is defined by just one catalog row, unlike tables, so you can't see a
self-inconsistent version of it.

The amount of lock overhead that it would take to remove those edge
cases seems slightly staggering, so I doubt we'd ever do it.

> While it'd do a small bit unnecessary work, I do wonder if it'd be
> better to use this code in ResourceOwnereReleaseInternal().

When and if we refactor to expose this sort of thing more generally,
it might be worth doing it like that.  I don't think it helps much
right now.

> Perhaps add a reference to the new (appreciated, btw) DO comment above?

Can do.

Again, thanks for reviewing!

regards, tom lane




Re: plan cache overhead on plpgsql expression

2020-03-26 Thread Andres Freund
Hi,

On 2020-03-26 14:37:59 -0400, Tom Lane wrote:
> I wrote:
> > I had a thought about a possibly-cleaner way to do this.  We could invent
> > a resowner function, say ResourceOwnerReleaseAllPlanCacheRefs, that
> > explicitly releases all plancache pins it knows about.  So plpgsql
> > would not call the regular ResourceOwnerRelease entry point at all,
> > but just call that and then ResourceOwnerDelete, again relying on the
> > assertions therein to catch anything not released.
> 
> Here's a version that does it like that.  This does seem marginally
> nicer than the other way.  I have a feeling that at some point we'll
> want to expose resowners' contents more generally, but I'm not quite
> sure what the use-cases will be, so I don't want to design that now.

Yea, agreed with all of what you said in that paragraph.


> Testing that reminded me of the other regression test failure I'd seen
> when I first tried to do it: select_parallel.sql shows a WARNING about
> a plancache leak in a parallel worker process.  When I looked into the
> reason for that, it turned out that some cowboy has split
> XACT_EVENT_COMMIT into XACT_EVENT_COMMIT and
> XACT_EVENT_PARALLEL_COMMIT (where the latter is used in parallel
> workers) without bothering to fix the collateral damage to plpgsql.
> So plpgsql_xact_cb isn't doing any cleanup in parallel workers, and
> hasn't been for a couple of releases.

Ugh.


> The bad effects of that are probably limited given that the worker
> process will exit after committing, but I still think that that part
> of this patch is a bug fix that needs to be back-patched.

Ugh. Lucky that we don't register many things inside those resowners.


> (Just
> looking at what FreeExecutorState does, I wonder whether
> jit_release_context has any side-effects that are visible outside the
> process?  But I bet I can make a test case that shows issues even
> without JIT, based on the failure to call ExprContext shutdown
> callbacks.)

JIT doesn't currently have side-effects outside of the process. I really
want to add caching support, which'd presumably have problems due to
this, but it's not there yet... This could lead to leaking a fair bit of
memory over time otherwise.



>  /*
> + * CachedPlanAllowsSimpleValidityCheck: can we use CachedPlanIsSimplyValid?
> + *
> + * This function, together with CachedPlanIsSimplyValid, provides a fast path
> + * for revalidating "simple" generic plans.  The core requirement to be 
> simple
> + * is that the plan must not require taking any locks, which translates to
> + * not touching any tables; this happens to match up well with an important
> + * use-case in PL/pgSQL.

Hm - is there currently sufficient guarantee that we absorb sinval
messages? Would still matter for types, functions, etc?


>  /*
> + * ResourceOwnerReleaseAllPlanCacheRefs
> + *   Release the plancache references (only) held by this owner.
> + *
> + * We might eventually add similar functions for other resource types,
> + * but for now, only this is needed.
> + */
> +void
> +ResourceOwnerReleaseAllPlanCacheRefs(ResourceOwner owner)
> +{
> + ResourceOwner save;
> + Datum   foundres;
> +
> + save = CurrentResourceOwner;
> + CurrentResourceOwner = owner;
> + while (ResourceArrayGetAny(&(owner->planrefarr), ))
> + {
> + CachedPlan *res = (CachedPlan *) DatumGetPointer(foundres);
> +
> + ReleaseCachedPlan(res, true);
> + }
> + CurrentResourceOwner = save;
> +}

While it'd do a small bit unnecessary work, I do wonder if it'd be
better to use this code in ResourceOwnereReleaseInternal().


> --- a/src/pl/plpgsql/src/pl_exec.c
> +++ b/src/pl/plpgsql/src/pl_exec.c
> @@ -84,6 +84,13 @@ typedef struct
>   * has its own simple-expression EState, which is cleaned up at exit from
>   * plpgsql_inline_handler().  DO blocks still use the simple_econtext_stack,
>   * though, so that subxact abort cleanup does the right thing.
> + *
> + * (However, if a DO block executes COMMIT or ROLLBACK, then exec_stmt_commit
> + * or exec_stmt_rollback will unlink it from the DO's simple-expression 
> EState
> + * and create a new shared EState that will be used thenceforth.  The 
> original
> + * EState will be cleaned up when we get back to plpgsql_inline_handler.  
> This
> + * is a bit ugly, but it isn't worth doing better, since scenarios like this
> + * can't result in indefinite accumulation of state trees.)
>   */
>  typedef struct SimpleEcontextStackEntry
>  {
> @@ -96,6 +103,15 @@ static EState *shared_simple_eval_estate = NULL;
>  static SimpleEcontextStackEntry *simple_econtext_stack = NULL;
>  
>  /*
> + * In addition to the shared simple-eval EState, we have a shared resource
> + * owner that holds refcounts on the CachedPlans for any "simple" expressions
> + * we have evaluated in the current transaction.  This allows us to avoid
> + * continually grabbing and releasing a plan refcount when a simple 
> expression
> + * is used over and 

Re: plan cache overhead on plpgsql expression

2020-03-26 Thread Tom Lane
I wrote:
> I had a thought about a possibly-cleaner way to do this.  We could invent
> a resowner function, say ResourceOwnerReleaseAllPlanCacheRefs, that
> explicitly releases all plancache pins it knows about.  So plpgsql
> would not call the regular ResourceOwnerRelease entry point at all,
> but just call that and then ResourceOwnerDelete, again relying on the
> assertions therein to catch anything not released.

Here's a version that does it like that.  This does seem marginally
nicer than the other way.  I have a feeling that at some point we'll
want to expose resowners' contents more generally, but I'm not quite
sure what the use-cases will be, so I don't want to design that now.

Also, I studied the question of DO blocks' eval_estate + resowner
more carefully, and eventually concluded that the way it's being
done is okay --- it doesn't leak memory, as I'd first suspected.
But it's surely underdocumented, so I added some comments about it.
I also concluded as part of that study that it's probably best if
we *do* make the resowner parentage different in the two cases
after all.  So this has the "shared" resowner as a child of
TopTransactionResourceOwner after all (which means we don't need
to delete it explicitly), while a DO block's private resowner is
standalone (so it needs an explicit deletion).

Testing that reminded me of the other regression test failure I'd seen
when I first tried to do it: select_parallel.sql shows a WARNING about
a plancache leak in a parallel worker process.  When I looked into the
reason for that, it turned out that some cowboy has split
XACT_EVENT_COMMIT into XACT_EVENT_COMMIT and
XACT_EVENT_PARALLEL_COMMIT (where the latter is used in parallel
workers) without bothering to fix the collateral damage to plpgsql.
So plpgsql_xact_cb isn't doing any cleanup in parallel workers, and
hasn't been for a couple of releases.  The bad effects of that are
probably limited given that the worker process will exit after
committing, but I still think that that part of this patch is a bug
fix that needs to be back-patched.  (Just looking at what
FreeExecutorState does, I wonder whether jit_release_context has any
side-effects that are visible outside the process?  But I bet I can
make a test case that shows issues even without JIT, based on the
failure to call ExprContext shutdown callbacks.)

Anyway, I think this is committable now.

regards, tom lane

diff --git a/src/pl/plpgsql/src/Makefile b/src/pl/plpgsql/src/Makefile
index 70a9c34..193df8a 100644
--- a/src/pl/plpgsql/src/Makefile
+++ b/src/pl/plpgsql/src/Makefile
@@ -33,8 +33,8 @@ DATA = plpgsql.control plpgsql--1.0.sql
 REGRESS_OPTS = --dbname=$(PL_TESTDB)
 
 REGRESS = plpgsql_call plpgsql_control plpgsql_copy plpgsql_domain \
-	plpgsql_record plpgsql_cache plpgsql_transaction plpgsql_trap \
-	plpgsql_trigger plpgsql_varprops
+	plpgsql_record plpgsql_cache plpgsql_simple plpgsql_transaction \
+	plpgsql_trap plpgsql_trigger plpgsql_varprops
 
 # where to find gen_keywordlist.pl and subsidiary files
 TOOLSDIR = $(top_srcdir)/src/tools
diff --git a/src/pl/plpgsql/src/expected/plpgsql_simple.out b/src/pl/plpgsql/src/expected/plpgsql_simple.out
new file mode 100644
index 000..5a2fefa
--- /dev/null
+++ b/src/pl/plpgsql/src/expected/plpgsql_simple.out
@@ -0,0 +1,68 @@
+--
+-- Tests for plpgsql's handling of "simple" expressions
+--
+-- Check that changes to an inline-able function are handled correctly
+create function simplesql(int) returns int language sql
+as 'select $1';
+create function simplecaller() returns int language plpgsql
+as $$
+declare
+  sum int := 0;
+begin
+  for n in 1..10 loop
+sum := sum + simplesql(n);
+if n = 5 then
+  create or replace function simplesql(int) returns int language sql
+  as 'select $1 + 100';
+end if;
+  end loop;
+  return sum;
+end$$;
+select simplecaller();
+ simplecaller 
+--
+  555
+(1 row)
+
+-- Check that changes in search path are dealt with correctly
+create schema simple1;
+create function simple1.simpletarget(int) returns int language plpgsql
+as $$begin return $1; end$$;
+create function simpletarget(int) returns int language plpgsql
+as $$begin return $1 + 100; end$$;
+create or replace function simplecaller() returns int language plpgsql
+as $$
+declare
+  sum int := 0;
+begin
+  for n in 1..10 loop
+sum := sum + simpletarget(n);
+if n = 5 then
+  set local search_path = 'simple1';
+end if;
+  end loop;
+  return sum;
+end$$;
+select simplecaller();
+ simplecaller 
+--
+  555
+(1 row)
+
+-- try it with non-volatile functions, too
+alter function simple1.simpletarget(int) immutable;
+alter function simpletarget(int) immutable;
+select simplecaller();
+ simplecaller 
+--
+  555
+(1 row)
+
+-- make sure flushing local caches changes nothing
+\c -
+select simplecaller();
+ simplecaller 
+--
+  555
+(1 row)
+
diff --git 

Re: plan cache overhead on plpgsql expression

2020-03-26 Thread Tom Lane
Amit Langote  writes:
> One thing -- I don't get the division between
> CachedPlanAllowsSimpleValidityCheck() and CachedPlanIsSimplyValid().
> Maybe I am missing something, but could there not be just one
> function, possibly using whether expr_simple_expr is set or not to
> skip or do, resp., the checks that the former does?

Well, we don't want to do the initial checks over again every time;
we want the is-valid test to be as simple and fast as we can make it.
I suppose we could have one function with a boolean flag saying "this is a
recheck", but I don't find that idea to be any better than the way it is.

Also, although the existing structure in plpgsql always calls
CachedPlanIsSimplyValid immediately after a successful call to
CachedPlanAllowsSimpleValidityCheck, I don't think that's necessarily
going to be true for other potential users of the functions.
So merging the functions would reduce flexibility.

regards, tom lane




Re: plan cache overhead on plpgsql expression

2020-03-26 Thread Tom Lane
Andres Freund  writes:
> On 2020-03-25 17:51:50 -0400, Tom Lane wrote:
>> Andres Freund  writes:
>>> Hm, any chance that the multiple resowner calls here could show up in a
>>> profile? Probably not?

>> Doubt it.  On the other hand, as the code stands it's certain that the
>> resowner contains nothing but plancache pins (while I was writing the
>> patch it wasn't entirely clear that that would hold).  So we could
>> drop the two unnecessary calls.  There are assertions in
>> ResourceOwnerDelete that would fire if we somehow missed releasing
>> anything, so it doesn't seem like much of a maintenance hazard.

> One could even argue that that's a nice crosscheck: Due to the later
> release it'd not actually be correct to just add "arbitrary" things to
> that resowner.

I had a thought about a possibly-cleaner way to do this.  We could invent
a resowner function, say ResourceOwnerReleaseAllPlanCacheRefs, that
explicitly releases all plancache pins it knows about.  So plpgsql
would not call the regular ResourceOwnerRelease entry point at all,
but just call that and then ResourceOwnerDelete, again relying on the
assertions therein to catch anything not released.

This would be slightly more code but it'd perhaps make it clearer
what's going on, without the cost of a duplicative data structure.
Perhaps in future there'd be use for similar calls for other resource
types.

regards, tom lane




Re: plan cache overhead on plpgsql expression

2020-03-26 Thread Amit Langote
On Thu, Mar 26, 2020 at 4:44 AM Tom Lane  wrote:
>
> Pavel Stehule  writes:
> > I'll mark this patch as ready for commiters.
>
> Thanks for reviewing!  Amit, do you have any thoughts on this?

Thanks for picking this up.  Test cases added by your patch really
shows why the plancache and the planner must not be skipped, something
I totally failed to grasp.

I can't really see any problem with your patch, but mainly due to my
unfamiliarity with some of the more complicated things it touches,
like resowner stuff.

One thing -- I don't get the division between
CachedPlanAllowsSimpleValidityCheck() and CachedPlanIsSimplyValid().
Maybe I am missing something, but could there not be just one
function, possibly using whether expr_simple_expr is set or not to
skip or do, resp., the checks that the former does?

--
Thank you,

Amit Langote
EnterpriseDB: http://www.enterprisedb.com




Re: plan cache overhead on plpgsql expression

2020-03-25 Thread Tom Lane
Andres Freund  writes:
> I'm still confused by the comment I was reacting to - the code
> explicitly is about creating the *shared* resowner:

Right, this is because of the choice I mentioned earlier about creating
this resowner the same way as the one for the inline case.  I guess the
comments could go into more detail.  Or we could make the parentage
different for the two cases, but I don't like that much.

regards, tom lane




Re: plan cache overhead on plpgsql expression

2020-03-25 Thread Andres Freund
Hi,

On 2020-03-25 19:15:28 -0400, Tom Lane wrote:
> >> The comment is there because the regression tests fall over if you try
> >> to do it the other way :-(.  The failure I saw was specific to a
> >> transaction being done in a DO block, and maybe we could handle that
> >> differently from the case for a normal procedure; but it seemed better
> >> to me to make them the same.
> 
> > I'm still confused as to why it actually fixes the issue. Feel we should
> > at least understand what's going on before commtting.
> 
> I do understand the issue.  If you make the simple-resowner a child
> of TopTransactionResourceOwner, it vanishes at commit --- but
> plpgsql_inline_handler has still got a pointer to it, which it'll try
> to free afterwards, if the commit was inside the DO block.

I was confused why it fixes that, because:

>  void
>  plpgsql_xact_cb(XactEvent event, void *arg)
>  {
>   /*
>* If we are doing a clean transaction shutdown, free the EState (so 
> that
> -  * any remaining resources will be released correctly). In an abort, we
> +  * any remaining resources will be released correctly).  In an abort, we
>* expect the regular abort recovery procedures to release everything of
> -  * interest.
> +  * interest.  The resowner has to be explicitly released in both cases,
> +  * though, since it's not a child of TopTransactionResourceOwner.
>*/
>   if (event == XACT_EVENT_COMMIT || event == XACT_EVENT_PREPARE)
>   {
> @@ -8288,11 +8413,17 @@ plpgsql_xact_cb(XactEvent event, void *arg)
>   if (shared_simple_eval_estate)
>   FreeExecutorState(shared_simple_eval_estate);
>   shared_simple_eval_estate = NULL;
> + if (shared_simple_eval_resowner)
> + 
> plpgsql_free_simple_resowner(shared_simple_eval_resowner);
> + shared_simple_eval_resowner = NULL;
>   }
>   else if (event == XACT_EVENT_ABORT)
>   {
>   simple_econtext_stack = NULL;
>   shared_simple_eval_estate = NULL;
> + if (shared_simple_eval_resowner)
> + 
> plpgsql_free_simple_resowner(shared_simple_eval_resowner);
> + shared_simple_eval_resowner = NULL;
>   }
>  }

will lead to shared_simple_eval_resowner being deleted before
TopTransactionResourceOwner is deleted:

static void
CommitTransaction(void)
...
CallXactCallbacks(is_parallel_worker ? XACT_EVENT_PARALLEL_COMMIT
  : XACT_EVENT_COMMIT);

ResourceOwnerRelease(TopTransactionResourceOwner,
 RESOURCE_RELEASE_BEFORE_LOCKS,
 true, true);

What I missed is that the inline handler will not use
shared_simple_eval_resowner, but instead use the function local
simple_eval_resowner. Which I had not realized before.


I'm still confused by the comment I was reacting to - the code
explicitly is about creating the *shared* resowner:

> +  * Likewise for the simple-expression resource owner.  (Note: it'd be
> +  * safer to create this as a child of TopTransactionResourceOwner; but
> +  * right now that causes issues in transaction-spanning procedures, so
> +  * make it standalone.)
> +  */
> + if (estate->simple_eval_resowner == NULL)
> + {
> + if (shared_simple_eval_resowner == NULL)
> + shared_simple_eval_resowner =
> + ResourceOwnerCreate(NULL, "PL/pgSQL simple 
> expressions");
> + estate->simple_eval_resowner = shared_simple_eval_resowner;
> + }

which, afaict, will always deleted before TopTransactionResourceOwner
goes away?


Greetings,

Andres Freund




Re: plan cache overhead on plpgsql expression

2020-03-25 Thread Tom Lane
Andres Freund  writes:
> On 2020-03-25 17:51:50 -0400, Tom Lane wrote:
>> Perhaps, but I'm not sure that either of those functions represent
>> material overhead in cases besides this one.

> That's not huge, but also not nothing.

I see.  So maybe worth the trouble --- but still, seems like material for
a separate patch.

>>> Would it make sense to instead compute this as we go when building a
>>> valid CachedPlanSource?

>> I'm inclined to think not, because it'd just be overhead for other
>> users of cached plans.

> Even if we make RevalidateCachedQuery take advantage of the simpler
> tests when possible?

I'm not convinced that any real optimization is practical once you
allow tables in the query.  You then have to check the RLS-active
flags in some form, and the existing tests are not *that* expensive
as long as the answer is "no".  At best I think you might be reducing
two or three simple tests to one.

Also, the reason why this is interesting at all for plpgsql simple
expressions is that the cost of these checks, simple as they are,
is a noticeable fraction of the total time to do a simple expression.
That's not going to be the case for queries involving table access.

>> The comment is there because the regression tests fall over if you try
>> to do it the other way :-(.  The failure I saw was specific to a
>> transaction being done in a DO block, and maybe we could handle that
>> differently from the case for a normal procedure; but it seemed better
>> to me to make them the same.

> I'm still confused as to why it actually fixes the issue. Feel we should
> at least understand what's going on before commtting.

I do understand the issue.  If you make the simple-resowner a child
of TopTransactionResourceOwner, it vanishes at commit --- but
plpgsql_inline_handler has still got a pointer to it, which it'll try
to free afterwards, if the commit was inside the DO block.

What's not entirely clear to me is why this in exec_stmt_commit

@@ -4825,6 +4845,7 @@ exec_stmt_commit(PLpgSQL_execstate *estate, 
PLpgSQL_stmt_commit *stmt)
}
 
estate->simple_eval_estate = NULL;
+   estate->simple_eval_resowner = NULL;
plpgsql_create_econtext(estate);
 
return PLPGSQL_RC_OK;

is okay --- it avoids having a dangling pointer, sure, but if we're inside
a DO block won't plpgsql_create_econtext create a simple_eval_estate (and,
now, simple_eval_resowner) with the wrong properties?  But that's a
pre-existing question, and maybe Peter got it right and there's no
problem.

>> Doubt it.  On the other hand, as the code stands it's certain that the
>> resowner contains nothing but plancache pins (while I was writing the
>> patch it wasn't entirely clear that that would hold).  So we could
>> drop the two unnecessary calls.  There are assertions in
>> ResourceOwnerDelete that would fire if we somehow missed releasing
>> anything, so it doesn't seem like much of a maintenance hazard.

> One could even argue that that's a nice crosscheck: Due to the later
> release it'd not actually be correct to just add "arbitrary" things to
> that resowner.

OK, I'll change that.

>> (1) Not given the existing set of uses of the push/pop capability, which
>> so far as I can see is *only* CREATE SCHEMA.

> I do recall that there were issues with SET search_path in functions
> causing noticable slowdowns...

Yeah, possibly that could be improved, but that seems outside the scope of
this patch.

>> (2) as this is written, it's totally unsafe for the generation counter
>> ever to back up; that risks false match detections later.

> I was just thinking of backing up the 'active generation' state. New
> generations would have to come from a separate 'next generation'
> counter.

Oh, I see.  Yeah, that could work, but there's no point until we have
push/pop calls that are actually interesting for performance.

regards, tom lane




Re: plan cache overhead on plpgsql expression

2020-03-25 Thread Andres Freund
Hi,

On 2020-03-25 17:51:50 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > I wonder if it'd make sense to store the locks needed for
> > AcquirePlannerLocks/AcquireExecutorLocks in a better form.
> 
> Perhaps, but I'm not sure that either of those functions represent
> material overhead in cases besides this one.

For pgbench -M prepared -S GetCachedPlan() and its children are 2.36% of
the time. 1.75% of the total is RevalidateCachedQuery(). 1.13% of that
in turn is LockAcquireExtended.

That's not huge, but also not nothing. And this includes client
roundtrips. So I assume it'd show up larger when executing actual
queries in a function, or when pipelining (which e.g. pgjdbc has on by
default).

If I to simple lookups from pgbench_accounts in a loop in plpgsql
GetCachedPlan() is 4.43% and the LockAcquireExtended()'s called from
within are 1.46%.

So it's plausible that making this a more generic optimization would be
worthwhile.


> > Would it make sense to instead compute this as we go when building a
> > valid CachedPlanSource? If we make it a property of a is_valid
> > CachedPlanSource, we can assert that the plan is safe for use in
> > CachedPlanIsSimplyValid().
> 
> I'm inclined to think not, because it'd just be overhead for other
> users of cached plans.

Even if we make RevalidateCachedQuery take advantage of the simpler
tests when possible?  While there's plenty of cases where it'd not be
applicable, it seems likely that those wouldn't notice the small
slowdown either.



> >> /*
> >> +   * Likewise for the simple-expression resource owner.  (Note: it'd be
> >> +   * safer to create this as a child of TopTransactionResourceOwner; but
> >> +   * right now that causes issues in transaction-spanning procedures, so
> >> +   * make it standalone.)
> >> +   */
> 
> > Hm. I'm quite unfamiliar with this area of the code - so I'm likely just
> > missing something: Given that you're using a post xact cleanup hook to
> > release the resowner, I'm not quite sure I understand this comment. The
> > XACT_EVENT_ABORT/COMMIT callbacks are called before
> > TopTransactionResourceOwner is released, no?
> 
> The comment is there because the regression tests fall over if you try
> to do it the other way :-(.  The failure I saw was specific to a
> transaction being done in a DO block, and maybe we could handle that
> differently from the case for a normal procedure; but it seemed better
> to me to make them the same.

I'm still confused as to why it actually fixes the issue. Feel we should
at least understand what's going on before commtting.


> >> +void
> >> +plpgsql_free_simple_resowner(ResourceOwner simple_eval_resowner)
> >> +{
> >> +  /*
> >> +   * At this writing, the only thing that could actually get released is
> >> +   * plancache refcounts; but we may as well do the full release protocol.
> 
> > Hm, any chance that the multiple resowner calls here could show up in a
> > profile? Probably not?
> 
> Doubt it.  On the other hand, as the code stands it's certain that the
> resowner contains nothing but plancache pins (while I was writing the
> patch it wasn't entirely clear that that would hold).  So we could
> drop the two unnecessary calls.  There are assertions in
> ResourceOwnerDelete that would fire if we somehow missed releasing
> anything, so it doesn't seem like much of a maintenance hazard.

One could even argue that that's a nice crosscheck: Due to the later
release it'd not actually be correct to just add "arbitrary" things to
that resowner.


> > Could it be worth optimizing the path generation logic so that a
> > push/pop of an override path restores the old generation?
> 
> (1) Not given the existing set of uses of the push/pop capability, which
> so far as I can see is *only* CREATE SCHEMA.

Oh. Well, then that'd be something for later.

I do recall that there were issues with SET search_path in functions
causing noticable slowdowns...


> It's not involved in any other manipulations of the search path.  And
> (2) as this is written, it's totally unsafe for the generation counter
> ever to back up; that risks false match detections later.

I was just thinking of backing up the 'active generation' state. New
generations would have to come from a separate 'next generation'
counter.

Greetings,

Andres Freund




Re: plan cache overhead on plpgsql expression

2020-03-25 Thread Tom Lane
Andres Freund  writes:
> I wonder if it'd make sense to store the locks needed for
> AcquirePlannerLocks/AcquireExecutorLocks in a better form.

Perhaps, but I'm not sure that either of those functions represent
material overhead in cases besides this one.

> Would it make sense to instead compute this as we go when building a
> valid CachedPlanSource? If we make it a property of a is_valid
> CachedPlanSource, we can assert that the plan is safe for use in
> CachedPlanIsSimplyValid().

I'm inclined to think not, because it'd just be overhead for other
users of cached plans.

> That's mighty subtle :/

Yeah :-(.  I don't like it that much, but I don't see an easy way to
do better, given the way that plpgsql manages its simple expressions.

>> /*
>> + * Likewise for the simple-expression resource owner.  (Note: it'd be
>> + * safer to create this as a child of TopTransactionResourceOwner; but
>> + * right now that causes issues in transaction-spanning procedures, so
>> + * make it standalone.)
>> + */

> Hm. I'm quite unfamiliar with this area of the code - so I'm likely just
> missing something: Given that you're using a post xact cleanup hook to
> release the resowner, I'm not quite sure I understand this comment. The
> XACT_EVENT_ABORT/COMMIT callbacks are called before
> TopTransactionResourceOwner is released, no?

The comment is there because the regression tests fall over if you try
to do it the other way :-(.  The failure I saw was specific to a
transaction being done in a DO block, and maybe we could handle that
differently from the case for a normal procedure; but it seemed better
to me to make them the same.

There's a separate question lurking under there, which is whether the
existing management of the simple-expression EState is right at all
for transaction-spanning DO blocks; frankly it smells a bit fishy to
me.  But looking into that did not seem in-scope for this patch.

>> +void
>> +plpgsql_free_simple_resowner(ResourceOwner simple_eval_resowner)
>> +{
>> +/*
>> + * At this writing, the only thing that could actually get released is
>> + * plancache refcounts; but we may as well do the full release protocol.

> Hm, any chance that the multiple resowner calls here could show up in a
> profile? Probably not?

Doubt it.  On the other hand, as the code stands it's certain that the
resowner contains nothing but plancache pins (while I was writing the
patch it wasn't entirely clear that that would hold).  So we could
drop the two unnecessary calls.  There are assertions in
ResourceOwnerDelete that would fire if we somehow missed releasing
anything, so it doesn't seem like much of a maintenance hazard.

>> + * We pass isCommit = false even when committing, to suppress
>> + * resource-leakage gripes, since we aren't bothering to release the
>> + * refcounts one-by-one.
>> + */

> That's a bit icky...

Agreed, and it's not like our practice elsewhere.  I thought about adding
a data structure that would track the set of held plancache pins outside
the resowner, but concluded that that'd just be pointless duplicative
overhead.

> Could it be worth optimizing the path generation logic so that a
> push/pop of an override path restores the old generation?

(1) Not given the existing set of uses of the push/pop capability, which
so far as I can see is *only* CREATE SCHEMA.  It's not involved in any
other manipulations of the search path.  And (2) as this is written, it's
totally unsafe for the generation counter ever to back up; that risks
false match detections later.

I appreciate the review!

regards, tom lane




Re: plan cache overhead on plpgsql expression

2020-03-25 Thread Andres Freund
Hi,

On 2020-03-21 14:24:05 -0400, Tom Lane wrote:
> So while there's clearly something worth pursuing here, I do not like
> anything about the way it was done.  I think that the right way to
> think about this problem is "how can the plan cache provide a fast
> path for checking validity of simple-expression plans?".  And when you
> think about it that way, there's a pretty obvious answer: if the plan
> involves no table references, there's not going to be any locks that
> have to be taken before we can check the is_valid flag.  So we can
> have a fast path that skips AcquirePlannerLocks and
> AcquireExecutorLocks, which are a big part of the problem, and we can
> also bypass some of the other random checks that GetCachedPlan has to
> make, like whether RLS affects the plan.

That makes sense to me.

I wonder if it'd make sense to store the locks needed for
AcquirePlannerLocks/AcquireExecutorLocks in a better form. Not really
instead of your optimization, but to also address simple statements that
do reference a relation. If we stored all the locks for a plansource in
an array, it should get cheaper - and automatically implement the fast
path of skipping AcquirePlannerLocks/AcquireExecutorLocks when there's
no relations.


> Another chunk of the issue is the constant acquisition and release of
> reference counts on the plan.  We can't really skip that (I suspect
> there are additional bugs in Amit's patch arising from trying to do so).
> However, plpgsql already has mechanisms for paying simple-expression
> setup costs once per transaction rather than once per expression use.
> So we can set up a simple-expression ResourceOwner managed much like
> the simple-expression EState, and have it hold a refcount on the
> CachedPlan for each simple expression, and pay that overhead just once
> per transaction.


> I haven't done any serious performance testing on this, but it gives
> circa 2X speedup on Pavel's original example, which is at least
> fairly close to the results that Amit's patch got there.  And it
> makes this last batch of test cases faster not slower, too.
> 
> With this patch, perf shows the hotspots on Pavel's original example
> as being
> 
> +   19.24%19.17% 46470  postmaster   plpgsql.so   
> [.] exec_eval_expr
> +   15.19%15.15% 36720  postmaster   plpgsql.so   
> [.] plpgsql_param_eval_var
> +   14.98%14.94% 36213  postmaster   postgres 
> [.] ExecInterpExpr
> +6.32% 6.30% 15262  postmaster   plpgsql.so   
> [.] exec_stmt
> +6.08% 6.06% 14681  postmaster   plpgsql.so   
> [.] exec_assign_value
> 
> Maybe there's more that could be done to knock fat out of
> exec_eval_expr and/or plpgsql_param_eval_var, but at least
> the plan cache isn't the bottleneck anymore.

Nice!


> diff --git a/src/backend/utils/cache/plancache.c 
> b/src/backend/utils/cache/plancache.c
> index dbae18d..8e27b03 100644
> --- a/src/backend/utils/cache/plancache.c
> +++ b/src/backend/utils/cache/plancache.c
> @@ -1278,6 +1278,160 @@ ReleaseCachedPlan(CachedPlan *plan, bool useResOwner)
>  }
>  
>  /*
> + * CachedPlanAllowsSimpleValidityCheck: can we use CachedPlanIsSimplyValid?
> + *
> + * This function, together with CachedPlanIsSimplyValid, provides a fast path
> + * for revalidating "simple" generic plans.  The core requirement to be 
> simple
> + * is that the plan must not require taking any locks, which translates to
> + * not touching any tables; this happens to match up well with an important
> + * use-case in PL/pgSQL.  This function tests whether that's true, along
> + * with checking some other corner cases that we'd rather not bother with
> + * handling in the fast path.  (Note that it's still possible for such a plan
> + * to be invalidated, for example due to a change in a function that was
> + * inlined into the plan.)
> + *
> + * This must only be called on known-valid generic plans (eg, ones just
> + * returned by GetCachedPlan).  If it returns true, the caller may re-use
> + * the cached plan as long as CachedPlanIsSimplyValid returns true; that
> + * check is much cheaper than the full revalidation done by GetCachedPlan.
> + * Nonetheless, no required checks are omitted.
> + */
> +bool
> +CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
> + 
> CachedPlan *plan)
> +{
> + ListCell   *lc;

Would it make sense to instead compute this as we go when building a
valid CachedPlanSource? If we make it a property of a is_valid
CachedPlanSource, we can assert that the plan is safe for use in
CachedPlanIsSimplyValid().

And perhaps also optimize the normal checks in RevalidateCachedQuery()
for cases not going through the "simple" path. We could not use the
optimizations around refcounts for those, but it still seems like it
could be useful? And less separate 

Re: plan cache overhead on plpgsql expression

2020-03-25 Thread Robert Haas
On Sat, Mar 21, 2020 at 2:24 PM Tom Lane  wrote:
> With this patch, perf shows the hotspots on Pavel's original example
> as being
>
> +   19.24%19.17% 46470  postmaster   plpgsql.so   
> [.] exec_eval_expr
> +   15.19%15.15% 36720  postmaster   plpgsql.so   
> [.] plpgsql_param_eval_var
> +   14.98%14.94% 36213  postmaster   postgres 
> [.] ExecInterpExpr
> +6.32% 6.30% 15262  postmaster   plpgsql.so   
> [.] exec_stmt
> +6.08% 6.06% 14681  postmaster   plpgsql.so   
> [.] exec_assign_value

That's pretty sweet. As you say, there's probably some way to
eliminate some of the non-plancache overhead, but it's still a big
improvement.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: plan cache overhead on plpgsql expression

2020-03-25 Thread Tom Lane
Pavel Stehule  writes:
> I'll mark this patch as ready for commiters.

Thanks for reviewing!  Amit, do you have any thoughts on this?

regards, tom lane




Re: plan cache overhead on plpgsql expression

2020-03-25 Thread Pavel Stehule
so 21. 3. 2020 v 21:29 odesílatel Pavel Stehule 
napsal:

>
>
> so 21. 3. 2020 v 19:24 odesílatel Tom Lane  napsal:
>
>> Pavel Stehule  writes:
>> > So the patch has a problem with constant casting - unfortunately the
>> mix of
>> > double precision variables and numeric constants is pretty often in
>> > Postgres.
>>
>> Yeah.  I believe the cause of that is that the patch thinks it can skip
>> passing an inline-function-free simple expression through the planner.
>> That's flat out wrong.  Quite aside from failing to perform
>> constant-folding (which is presumably the cause of the slowdown you
>> spotted), that means that we miss performing such non-optional
>> transformations as rearranging named-function-argument notation into
>> positional order.  I didn't bother to test that but I'm sure it can be
>> shown to lead to crashes.
>>
>> Now that I've looked at the patch I don't like it one bit from a
>> structural standpoint either.  It's basically trying to make an end
>> run around the plancache, which is not going to be maintainable even
>> if it correctly accounted for everything the plancache does today.
>> Which it doesn't.  Two big problems are:
>>
>> * It doesn't account for the possibility of search_path changes
>> affecting the interpretation of an expression.
>>
>> * It assumes that the *only* things that a simple plan could get
>> invalidated for are functions that were inlined.  This isn't the
>> case --- a counterexample is that removal of no-op CoerceToDomain
>> nodes requires the plan to be invalidated if the domain's constraints
>> change.  And there are likely to be more such issues in future.
>>
>>
>> So while there's clearly something worth pursuing here, I do not like
>> anything about the way it was done.  I think that the right way to
>> think about this problem is "how can the plan cache provide a fast
>> path for checking validity of simple-expression plans?".  And when you
>> think about it that way, there's a pretty obvious answer: if the plan
>> involves no table references, there's not going to be any locks that
>> have to be taken before we can check the is_valid flag.  So we can
>> have a fast path that skips AcquirePlannerLocks and
>> AcquireExecutorLocks, which are a big part of the problem, and we can
>> also bypass some of the other random checks that GetCachedPlan has to
>> make, like whether RLS affects the plan.
>>
>> Another chunk of the issue is the constant acquisition and release of
>> reference counts on the plan.  We can't really skip that (I suspect
>> there are additional bugs in Amit's patch arising from trying to do so).
>> However, plpgsql already has mechanisms for paying simple-expression
>> setup costs once per transaction rather than once per expression use.
>> So we can set up a simple-expression ResourceOwner managed much like
>> the simple-expression EState, and have it hold a refcount on the
>> CachedPlan for each simple expression, and pay that overhead just once
>> per transaction.
>>
>> So I worked on those ideas for awhile, and came up with the attached
>> patchset:
>>
>> 0001 adds some regression tests in this area (Amit's patch fails the
>> tests concerning search_path changes).
>>
>> 0002 does what's suggested above.  I also did a little bit of marginal
>> micro-tuning in exec_eval_simple_expr() itself.
>>
>> 0003 improves the biggest remaining cost of validity rechecking,
>> which is verifying that the search_path is the same as it was when
>> the plan was cached.
>>
>> I haven't done any serious performance testing on this, but it gives
>> circa 2X speedup on Pavel's original example, which is at least
>> fairly close to the results that Amit's patch got there.  And it
>> makes this last batch of test cases faster not slower, too.
>>
>> With this patch, perf shows the hotspots on Pavel's original example
>> as being
>>
>> +   19.24%19.17% 46470  postmaster   plpgsql.so
>>  [.] exec_eval_expr
>> +   15.19%15.15% 36720  postmaster   plpgsql.so
>>  [.] plpgsql_param_eval_var
>> +   14.98%14.94% 36213  postmaster   postgres
>>  [.] ExecInterpExpr
>> +6.32% 6.30% 15262  postmaster   plpgsql.so
>>  [.] exec_stmt
>> +6.08% 6.06% 14681  postmaster   plpgsql.so
>>  [.] exec_assign_value
>>
>> Maybe there's more that could be done to knock fat out of
>> exec_eval_expr and/or plpgsql_param_eval_var, but at least
>> the plan cache isn't the bottleneck anymore.
>>
>
> I tested Tom's patches, and I can confirm these results.
>
> It doesn't break tests (and all tests plpgsql_check tests passed without
> problems).
>
> The high overhead of ExecInterpExpr is related to prepare fcinfo, and
> checking nulls arguments because all functions are strict
> plpgsql_param_eval_var, looks like expensive is var = (PLpgSQL_var *)
> estate->datums[dno] and *op->resvalue = var->value;
>

I rechecked Tom's patch, and all tests passed, and there 

Re: plan cache overhead on plpgsql expression

2020-03-21 Thread Pavel Stehule
so 21. 3. 2020 v 19:24 odesílatel Tom Lane  napsal:

> Pavel Stehule  writes:
> > So the patch has a problem with constant casting - unfortunately the mix
> of
> > double precision variables and numeric constants is pretty often in
> > Postgres.
>
> Yeah.  I believe the cause of that is that the patch thinks it can skip
> passing an inline-function-free simple expression through the planner.
> That's flat out wrong.  Quite aside from failing to perform
> constant-folding (which is presumably the cause of the slowdown you
> spotted), that means that we miss performing such non-optional
> transformations as rearranging named-function-argument notation into
> positional order.  I didn't bother to test that but I'm sure it can be
> shown to lead to crashes.
>
> Now that I've looked at the patch I don't like it one bit from a
> structural standpoint either.  It's basically trying to make an end
> run around the plancache, which is not going to be maintainable even
> if it correctly accounted for everything the plancache does today.
> Which it doesn't.  Two big problems are:
>
> * It doesn't account for the possibility of search_path changes
> affecting the interpretation of an expression.
>
> * It assumes that the *only* things that a simple plan could get
> invalidated for are functions that were inlined.  This isn't the
> case --- a counterexample is that removal of no-op CoerceToDomain
> nodes requires the plan to be invalidated if the domain's constraints
> change.  And there are likely to be more such issues in future.
>
>
> So while there's clearly something worth pursuing here, I do not like
> anything about the way it was done.  I think that the right way to
> think about this problem is "how can the plan cache provide a fast
> path for checking validity of simple-expression plans?".  And when you
> think about it that way, there's a pretty obvious answer: if the plan
> involves no table references, there's not going to be any locks that
> have to be taken before we can check the is_valid flag.  So we can
> have a fast path that skips AcquirePlannerLocks and
> AcquireExecutorLocks, which are a big part of the problem, and we can
> also bypass some of the other random checks that GetCachedPlan has to
> make, like whether RLS affects the plan.
>
> Another chunk of the issue is the constant acquisition and release of
> reference counts on the plan.  We can't really skip that (I suspect
> there are additional bugs in Amit's patch arising from trying to do so).
> However, plpgsql already has mechanisms for paying simple-expression
> setup costs once per transaction rather than once per expression use.
> So we can set up a simple-expression ResourceOwner managed much like
> the simple-expression EState, and have it hold a refcount on the
> CachedPlan for each simple expression, and pay that overhead just once
> per transaction.
>
> So I worked on those ideas for awhile, and came up with the attached
> patchset:
>
> 0001 adds some regression tests in this area (Amit's patch fails the
> tests concerning search_path changes).
>
> 0002 does what's suggested above.  I also did a little bit of marginal
> micro-tuning in exec_eval_simple_expr() itself.
>
> 0003 improves the biggest remaining cost of validity rechecking,
> which is verifying that the search_path is the same as it was when
> the plan was cached.
>
> I haven't done any serious performance testing on this, but it gives
> circa 2X speedup on Pavel's original example, which is at least
> fairly close to the results that Amit's patch got there.  And it
> makes this last batch of test cases faster not slower, too.
>
> With this patch, perf shows the hotspots on Pavel's original example
> as being
>
> +   19.24%19.17% 46470  postmaster   plpgsql.so
>[.] exec_eval_expr
> +   15.19%15.15% 36720  postmaster   plpgsql.so
>[.] plpgsql_param_eval_var
> +   14.98%14.94% 36213  postmaster   postgres
>[.] ExecInterpExpr
> +6.32% 6.30% 15262  postmaster   plpgsql.so
>[.] exec_stmt
> +6.08% 6.06% 14681  postmaster   plpgsql.so
>[.] exec_assign_value
>
> Maybe there's more that could be done to knock fat out of
> exec_eval_expr and/or plpgsql_param_eval_var, but at least
> the plan cache isn't the bottleneck anymore.
>

I tested Tom's patches, and I can confirm these results.

It doesn't break tests (and all tests plpgsql_check tests passed without
problems).

The high overhead of ExecInterpExpr is related to prepare fcinfo, and
checking nulls arguments because all functions are strict
plpgsql_param_eval_var, looks like expensive is var = (PLpgSQL_var *)
estate->datums[dno] and *op->resvalue = var->value;

It looks great.

Pavel



>
> regards, tom lane
>
>


Re: plan cache overhead on plpgsql expression

2020-03-21 Thread Tom Lane
Pavel Stehule  writes:
> So the patch has a problem with constant casting - unfortunately the mix of
> double precision variables and numeric constants is pretty often in
> Postgres.

Yeah.  I believe the cause of that is that the patch thinks it can skip
passing an inline-function-free simple expression through the planner.
That's flat out wrong.  Quite aside from failing to perform
constant-folding (which is presumably the cause of the slowdown you
spotted), that means that we miss performing such non-optional
transformations as rearranging named-function-argument notation into
positional order.  I didn't bother to test that but I'm sure it can be
shown to lead to crashes.

Now that I've looked at the patch I don't like it one bit from a
structural standpoint either.  It's basically trying to make an end
run around the plancache, which is not going to be maintainable even
if it correctly accounted for everything the plancache does today.
Which it doesn't.  Two big problems are:

* It doesn't account for the possibility of search_path changes
affecting the interpretation of an expression.

* It assumes that the *only* things that a simple plan could get
invalidated for are functions that were inlined.  This isn't the
case --- a counterexample is that removal of no-op CoerceToDomain
nodes requires the plan to be invalidated if the domain's constraints
change.  And there are likely to be more such issues in future.


So while there's clearly something worth pursuing here, I do not like
anything about the way it was done.  I think that the right way to
think about this problem is "how can the plan cache provide a fast
path for checking validity of simple-expression plans?".  And when you
think about it that way, there's a pretty obvious answer: if the plan
involves no table references, there's not going to be any locks that
have to be taken before we can check the is_valid flag.  So we can
have a fast path that skips AcquirePlannerLocks and
AcquireExecutorLocks, which are a big part of the problem, and we can
also bypass some of the other random checks that GetCachedPlan has to
make, like whether RLS affects the plan.

Another chunk of the issue is the constant acquisition and release of
reference counts on the plan.  We can't really skip that (I suspect
there are additional bugs in Amit's patch arising from trying to do so).
However, plpgsql already has mechanisms for paying simple-expression
setup costs once per transaction rather than once per expression use.
So we can set up a simple-expression ResourceOwner managed much like
the simple-expression EState, and have it hold a refcount on the
CachedPlan for each simple expression, and pay that overhead just once
per transaction.

So I worked on those ideas for awhile, and came up with the attached
patchset:

0001 adds some regression tests in this area (Amit's patch fails the
tests concerning search_path changes).

0002 does what's suggested above.  I also did a little bit of marginal
micro-tuning in exec_eval_simple_expr() itself.

0003 improves the biggest remaining cost of validity rechecking,
which is verifying that the search_path is the same as it was when
the plan was cached.

I haven't done any serious performance testing on this, but it gives
circa 2X speedup on Pavel's original example, which is at least
fairly close to the results that Amit's patch got there.  And it
makes this last batch of test cases faster not slower, too.

With this patch, perf shows the hotspots on Pavel's original example
as being

+   19.24%19.17% 46470  postmaster   plpgsql.so 
  [.] exec_eval_expr
+   15.19%15.15% 36720  postmaster   plpgsql.so 
  [.] plpgsql_param_eval_var
+   14.98%14.94% 36213  postmaster   postgres   
  [.] ExecInterpExpr
+6.32% 6.30% 15262  postmaster   plpgsql.so 
  [.] exec_stmt
+6.08% 6.06% 14681  postmaster   plpgsql.so 
  [.] exec_assign_value

Maybe there's more that could be done to knock fat out of
exec_eval_expr and/or plpgsql_param_eval_var, but at least
the plan cache isn't the bottleneck anymore.

regards, tom lane

diff --git a/src/pl/plpgsql/src/Makefile b/src/pl/plpgsql/src/Makefile
index 70a9c34..193df8a 100644
--- a/src/pl/plpgsql/src/Makefile
+++ b/src/pl/plpgsql/src/Makefile
@@ -33,8 +33,8 @@ DATA = plpgsql.control plpgsql--1.0.sql
 REGRESS_OPTS = --dbname=$(PL_TESTDB)
 
 REGRESS = plpgsql_call plpgsql_control plpgsql_copy plpgsql_domain \
-	plpgsql_record plpgsql_cache plpgsql_transaction plpgsql_trap \
-	plpgsql_trigger plpgsql_varprops
+	plpgsql_record plpgsql_cache plpgsql_simple plpgsql_transaction \
+	plpgsql_trap plpgsql_trigger plpgsql_varprops
 
 # where to find gen_keywordlist.pl and subsidiary files
 TOOLSDIR = $(top_srcdir)/src/tools
diff --git a/src/pl/plpgsql/src/expected/plpgsql_simple.out 

Re: plan cache overhead on plpgsql expression

2020-03-20 Thread Pavel Stehule
Hi

I did another test

I use a pi estimation algorithm and it is little bit more realistic than
just almost empty cycle body - still probably nobody will calculate pi in
plpgsql.

CREATE OR REPLACE FUNCTION pi_est(n int)
RETURNS numeric AS $$
DECLARE
  accum double precision DEFAULT 1.0;
  c1 double precision DEFAULT 2.0;
  c2 double precision DEFAULT 1.0;
  v constant double precision DEFAULT 2.0;
BEGIN
  FOR i IN 1..n
  LOOP
accum := accum * ((c1 * c1) / (c2 * (c2 + v)));
c1 := c1 + v;
c2 := c2 + v;
  END LOOP;
  RETURN accum * v;
END;
$$ LANGUAGE plpgsql;

For this code the patch increased speed for 1000 iterations from 6.3
sec to 4.7  .. it is speedup about 25%

The best performance (28%) is with code

CREATE OR REPLACE FUNCTION pi_est_2(n int)
RETURNS numeric AS $$
DECLARE
  accum double precision DEFAULT 1.0;
  c1 double precision DEFAULT 2.0;
  c2 double precision DEFAULT 1.0;
BEGIN
  FOR i IN 1..n
  LOOP
accum := accum * ((c1 * c1) / (c2 * (c2 + double precision '2.0')));
c1 := c1 + double precision '2.0';
c2 := c2 + double precision '2.0';
  END LOOP;
  RETURN accum * double precision '2.0';
END;
$$ LANGUAGE plpgsql;

Unfortunately for unoptimized code the performance is worse (it is about
55% slower)

CREATE OR REPLACE FUNCTION pi_est_1(n int)
RETURNS numeric AS $$
DECLARE
  accum double precision DEFAULT 1.0;
  c1 double precision DEFAULT 2.0;
  c2 double precision DEFAULT 1.0;
BEGIN
  FOR i IN 1..n
  LOOP
accum := accum * ((c1 * c1) / (c2 * (c2 + 2.0)));
c1 := c1 + 2.0;
c2 := c2 + 2.0;
  END LOOP;
  RETURN accum * 2.0;
END;
$$ LANGUAGE plpgsql;

same performance (bad) is for explicit casting

CREATE OR REPLACE FUNCTION pi_est_3(n int)
RETURNS numeric AS $$
DECLARE
  accum double precision DEFAULT 1.0;
  c1 double precision DEFAULT 2.0;
  c2 double precision DEFAULT 1.0;
BEGIN
  FOR i IN 1..n
  LOOP
accum := accum * ((c1 * c1) / (c2 * (c2 + 2.0::double precision)));
c1 := c1 + 2.0::double precision;
c2 := c2 + 2.0::double precision;
  END LOOP;
  RETURN accum * double precision '2.0';
END;
$$ LANGUAGE plpgsql;

There is relative high overhead of cast from numeric init_var_from_num.

On master (without patching) the speed all double precision variants is
almost same.

This example can be reduced

CREATE OR REPLACE FUNCTION public.fx(integer)
 RETURNS double precision
 LANGUAGE plpgsql
AS $function$
DECLARE
  result double precision DEFAULT 1.0;
BEGIN
  FOR i IN 1..$1
  LOOP
result := result * 1.01::double precision;
  END LOOP;
  RETURN result;
END;
$function$

CREATE OR REPLACE FUNCTION public.fx_1(integer)
 RETURNS double precision
 LANGUAGE plpgsql
AS $function$
DECLARE
  result double precision DEFAULT 1.0;
BEGIN
  FOR i IN 1..$1
  LOOP
result := result * 1.01;
  END LOOP;
  RETURN result;
END;
$function$

CREATE OR REPLACE FUNCTION public.fx_2(integer)
 RETURNS double precision
 LANGUAGE plpgsql
AS $function$
DECLARE
  result double precision DEFAULT 1.0;
BEGIN
  FOR i IN 1..$1
  LOOP
result := result * double precision '1.01';
  END LOOP;
  RETURN result;
END;
$function$

Patched select fx(100) .. 400ms, fx_1 .. 400ms, fx_2  .. 126ms
Master fx(100) .. 180ms, fx_1 180 ms, fx_2 .. 180ms

So the patch has a problem with constant casting - unfortunately the mix of
double precision variables and numeric constants is pretty often in
Postgres.

Regards

Pavel


test.sql
Description: application/sql


Re: plan cache overhead on plpgsql expression

2020-03-20 Thread Pavel Stehule
čt 19. 3. 2020 v 10:47 odesílatel Amit Langote 
napsal:

> Hi Pavel,
>
> Sorry it took me a while to look at this.
>
> On Tue, Feb 25, 2020 at 4:28 AM Pavel Stehule 
> wrote:
> > po 24. 2. 2020 v 18:56 odesílatel Pavel Stehule 
> napsal:
> >> But I found one issue - I don't know if this issue is related to your
> patch or plpgsql_check.
> >>
> >> plpgsql_check try to clean after it was executed - it cleans all plans.
> But some pointers on simple expressions are broken after catched exceptions.
> >>
> >> expr->plan = 0x80. Is interesting, so other fields of this expressions
> are correct.
> >
> > I am not sure, but after patching the SPI_prepare_params the current
> memory context is some short memory context.
> >
> > Can SPI_prepare_params change current memory context? It did before. But
> after patching different memory context is active.
>
> I haven't been able to see the behavior you reported.  Could you let
> me know what unexpected memory context you see in the problematic
> case?
>

There was a problem with plpgsql_check after I applied this patch. It
crashed differently on own regress tests.

But I cannot to reproduce this issue now. Probably there was more issues
than one on my build environment.

So my questions and notes about a change of MemoryContext after patching
are messy. Sorry for noise.

Regards

Pavel



>
> --
> Thank you,
> Amit
>


Re: plan cache overhead on plpgsql expression

2020-03-19 Thread Pavel Stehule
čt 19. 3. 2020 v 10:47 odesílatel Amit Langote 
napsal:

> Hi Pavel,
>
> Sorry it took me a while to look at this.
>
> On Tue, Feb 25, 2020 at 4:28 AM Pavel Stehule 
> wrote:
> > po 24. 2. 2020 v 18:56 odesílatel Pavel Stehule 
> napsal:
> >> But I found one issue - I don't know if this issue is related to your
> patch or plpgsql_check.
> >>
> >> plpgsql_check try to clean after it was executed - it cleans all plans.
> But some pointers on simple expressions are broken after catched exceptions.
> >>
> >> expr->plan = 0x80. Is interesting, so other fields of this expressions
> are correct.
> >
> > I am not sure, but after patching the SPI_prepare_params the current
> memory context is some short memory context.
> >
> > Can SPI_prepare_params change current memory context? It did before. But
> after patching different memory context is active.
>
> I haven't been able to see the behavior you reported.  Could you let
> me know what unexpected memory context you see in the problematic
>
case?
>

How I can detect it? Are there some steps for debugging memory context?

Pavel

>
> --
> Thank you,
> Amit
>


Re: plan cache overhead on plpgsql expression

2020-03-19 Thread Amit Langote
Hi Pavel,

Sorry it took me a while to look at this.

On Tue, Feb 25, 2020 at 4:28 AM Pavel Stehule  wrote:
> po 24. 2. 2020 v 18:56 odesílatel Pavel Stehule  
> napsal:
>> But I found one issue - I don't know if this issue is related to your patch 
>> or plpgsql_check.
>>
>> plpgsql_check try to clean after it was executed - it cleans all plans. But 
>> some pointers on simple expressions are broken after catched exceptions.
>>
>> expr->plan = 0x80. Is interesting, so other fields of this expressions are 
>> correct.
>
> I am not sure, but after patching the SPI_prepare_params the current memory 
> context is some short memory context.
>
> Can SPI_prepare_params change current memory context? It did before. But 
> after patching different memory context is active.

I haven't been able to see the behavior you reported.  Could you let
me know what unexpected memory context you see in the problematic
case?

--
Thank you,
Amit




Re: plan cache overhead on plpgsql expression

2020-03-17 Thread Amit Langote
Hi David,

On Tue, Mar 17, 2020 at 8:53 PM David Steele  wrote:
>
> Hi Amit,
>
> On 2/25/20 3:42 AM, Amit Langote wrote:
> > On Tue, Feb 25, 2020 at 4:16 PM Pavel Stehule  
> > wrote:
> >> I added this patch to a commitfest
> >>
> >> https://commitfest.postgresql.org/27/2467/
> >>
> >> It is very interesting speedup and it is in good direction to JIT 
> >> expressions
> >
> > Thank you.  I was planning to do that myself.
> >
> > I will take a look at your other comments in a day or two.
>
> Do you know when you'll have chance to look at Pavel's comments?

Sorry, I had forgotten about this. I will try to post an update by Thursday.

-- 
Thank you,
Amit




Re: plan cache overhead on plpgsql expression

2020-03-17 Thread David Steele

Hi Amit,

On 2/25/20 3:42 AM, Amit Langote wrote:

On Tue, Feb 25, 2020 at 4:16 PM Pavel Stehule  wrote:

I added this patch to a commitfest

https://commitfest.postgresql.org/27/2467/

It is very interesting speedup and it is in good direction to JIT expressions


Thank you.  I was planning to do that myself.

I will take a look at your other comments in a day or two.


Do you know when you'll have chance to look at Pavel's comments?

Regards,
--
-David
da...@pgmasters.net




Re: plan cache overhead on plpgsql expression

2020-02-25 Thread Amit Langote
Hi Pavel,

On Tue, Feb 25, 2020 at 4:16 PM Pavel Stehule  wrote:
>
> Hi
>
> I added this patch to a commitfest
>
> https://commitfest.postgresql.org/27/2467/
>
> It is very interesting speedup and it is in good direction to JIT expressions

Thank you.  I was planning to do that myself.

I will take a look at your other comments in a day or two.

Thanks,
Amit




Re: plan cache overhead on plpgsql expression

2020-02-24 Thread Pavel Stehule
Hi

I added this patch to a commitfest

https://commitfest.postgresql.org/27/2467/

It is very interesting speedup and it is in good direction to JIT
expressions

Pavel


Re: plan cache overhead on plpgsql expression

2020-02-24 Thread Pavel Stehule
po 24. 2. 2020 v 18:56 odesílatel Pavel Stehule 
napsal:

>
>
> po 24. 2. 2020 v 18:47 odesílatel Pavel Stehule 
> napsal:
>
>>
>>
>> čt 20. 2. 2020 v 20:15 odesílatel Pavel Stehule 
>> napsal:
>>
>>>
>>>
>>> st 19. 2. 2020 v 8:09 odesílatel Amit Langote 
>>> napsal:
>>>
 On Wed, Feb 19, 2020 at 3:56 PM Amit Langote 
 wrote:
 > On Wed, Feb 19, 2020 at 3:38 PM Pavel Stehule <
 pavel.steh...@gmail.com> wrote:
 > > st 19. 2. 2020 v 7:30 odesílatel Pavel Stehule <
 pavel.steh...@gmail.com> napsal:
 > >> út 18. 2. 2020 v 17:08 odesílatel Amit Langote <
 amitlangot...@gmail.com> napsal:
 > >>> > I updated the patch to do that.
 > >>> >
 > >>> > With the new patch, `select foo()`, with inline-able sql_incr()
 in it,
 > >>> > runs in 679 ms.
 > >>> >
 > >>> > Without any inline-able function, it runs in 330 ms, whereas
 with
 > >>> > HEAD, it takes 590 ms.
 > >>>
 > >>> I polished it a bit.
 > >>
 > >>
 > >> the performance looks very interesting - on my comp the execution
 time of  1 iterations was decreased from 34 sec to 15 sec,
 > >>
 > >> So it is interesting speedup
 > >
 > > but regress tests fails
 >
 > Oops, I failed to check src/pl/plpgsql tests.
 >
 > Fixed in the attached.

 Added a regression test based on examples discussed here too.

>>>
>>> It is working without problems
>>>
>>> I think this patch is very interesting for Postgres 13
>>>
>>
>> I checked a performance of this patch again and I think so there is not
>> too much space for another optimization - maybe JIT can help.
>>
>> There is relative high overhead of call of strict functions - the params
>> are repeatedly tested against NULL.
>>
>
> But I found one issue - I don't know if this issue is related to your
> patch or plpgsql_check.
>
> plpgsql_check try to clean after it was executed - it cleans all plans.
> But some pointers on simple expressions are broken after catched exceptions.
>
> expr->plan = 0x80. Is interesting, so other fields of this expressions are
> correct.
>

I am not sure, but after patching the SPI_prepare_params the current memory
context is some short memory context.

Can SPI_prepare_params change current memory context? It did before. But
after patching different memory context is active.

Regards

Pavel


>
>
>
>
>> Regards
>>
>> Pavel
>>
>>
>>
>>> Regards
>>>
>>> Pavel
>>>

 Thanks,
 Amit

>>>


Re: plan cache overhead on plpgsql expression

2020-02-24 Thread Pavel Stehule
po 24. 2. 2020 v 18:47 odesílatel Pavel Stehule 
napsal:

>
>
> čt 20. 2. 2020 v 20:15 odesílatel Pavel Stehule 
> napsal:
>
>>
>>
>> st 19. 2. 2020 v 8:09 odesílatel Amit Langote 
>> napsal:
>>
>>> On Wed, Feb 19, 2020 at 3:56 PM Amit Langote 
>>> wrote:
>>> > On Wed, Feb 19, 2020 at 3:38 PM Pavel Stehule 
>>> wrote:
>>> > > st 19. 2. 2020 v 7:30 odesílatel Pavel Stehule <
>>> pavel.steh...@gmail.com> napsal:
>>> > >> út 18. 2. 2020 v 17:08 odesílatel Amit Langote <
>>> amitlangot...@gmail.com> napsal:
>>> > >>> > I updated the patch to do that.
>>> > >>> >
>>> > >>> > With the new patch, `select foo()`, with inline-able sql_incr()
>>> in it,
>>> > >>> > runs in 679 ms.
>>> > >>> >
>>> > >>> > Without any inline-able function, it runs in 330 ms, whereas with
>>> > >>> > HEAD, it takes 590 ms.
>>> > >>>
>>> > >>> I polished it a bit.
>>> > >>
>>> > >>
>>> > >> the performance looks very interesting - on my comp the execution
>>> time of  1 iterations was decreased from 34 sec to 15 sec,
>>> > >>
>>> > >> So it is interesting speedup
>>> > >
>>> > > but regress tests fails
>>> >
>>> > Oops, I failed to check src/pl/plpgsql tests.
>>> >
>>> > Fixed in the attached.
>>>
>>> Added a regression test based on examples discussed here too.
>>>
>>
>> It is working without problems
>>
>> I think this patch is very interesting for Postgres 13
>>
>
> I checked a performance of this patch again and I think so there is not
> too much space for another optimization - maybe JIT can help.
>
> There is relative high overhead of call of strict functions - the params
> are repeatedly tested against NULL.
>

But I found one issue - I don't know if this issue is related to your patch
or plpgsql_check.

plpgsql_check try to clean after it was executed - it cleans all plans. But
some pointers on simple expressions are broken after catched exceptions.

expr->plan = 0x80. Is interesting, so other fields of this expressions are
correct.





> Regards
>
> Pavel
>
>
>
>> Regards
>>
>> Pavel
>>
>>>
>>> Thanks,
>>> Amit
>>>
>>


Re: plan cache overhead on plpgsql expression

2020-02-24 Thread Pavel Stehule
čt 20. 2. 2020 v 20:15 odesílatel Pavel Stehule 
napsal:

>
>
> st 19. 2. 2020 v 8:09 odesílatel Amit Langote 
> napsal:
>
>> On Wed, Feb 19, 2020 at 3:56 PM Amit Langote 
>> wrote:
>> > On Wed, Feb 19, 2020 at 3:38 PM Pavel Stehule 
>> wrote:
>> > > st 19. 2. 2020 v 7:30 odesílatel Pavel Stehule <
>> pavel.steh...@gmail.com> napsal:
>> > >> út 18. 2. 2020 v 17:08 odesílatel Amit Langote <
>> amitlangot...@gmail.com> napsal:
>> > >>> > I updated the patch to do that.
>> > >>> >
>> > >>> > With the new patch, `select foo()`, with inline-able sql_incr()
>> in it,
>> > >>> > runs in 679 ms.
>> > >>> >
>> > >>> > Without any inline-able function, it runs in 330 ms, whereas with
>> > >>> > HEAD, it takes 590 ms.
>> > >>>
>> > >>> I polished it a bit.
>> > >>
>> > >>
>> > >> the performance looks very interesting - on my comp the execution
>> time of  1 iterations was decreased from 34 sec to 15 sec,
>> > >>
>> > >> So it is interesting speedup
>> > >
>> > > but regress tests fails
>> >
>> > Oops, I failed to check src/pl/plpgsql tests.
>> >
>> > Fixed in the attached.
>>
>> Added a regression test based on examples discussed here too.
>>
>
> It is working without problems
>
> I think this patch is very interesting for Postgres 13
>

I checked a performance of this patch again and I think so there is not too
much space for another optimization - maybe JIT can help.

There is relative high overhead of call of strict functions - the params
are repeatedly tested against NULL.

Regards

Pavel



> Regards
>
> Pavel
>
>>
>> Thanks,
>> Amit
>>
>


Re: plan cache overhead on plpgsql expression

2020-02-20 Thread Pavel Stehule
st 19. 2. 2020 v 8:09 odesílatel Amit Langote 
napsal:

> On Wed, Feb 19, 2020 at 3:56 PM Amit Langote 
> wrote:
> > On Wed, Feb 19, 2020 at 3:38 PM Pavel Stehule 
> wrote:
> > > st 19. 2. 2020 v 7:30 odesílatel Pavel Stehule <
> pavel.steh...@gmail.com> napsal:
> > >> út 18. 2. 2020 v 17:08 odesílatel Amit Langote <
> amitlangot...@gmail.com> napsal:
> > >>> > I updated the patch to do that.
> > >>> >
> > >>> > With the new patch, `select foo()`, with inline-able sql_incr() in
> it,
> > >>> > runs in 679 ms.
> > >>> >
> > >>> > Without any inline-able function, it runs in 330 ms, whereas with
> > >>> > HEAD, it takes 590 ms.
> > >>>
> > >>> I polished it a bit.
> > >>
> > >>
> > >> the performance looks very interesting - on my comp the execution
> time of  1 iterations was decreased from 34 sec to 15 sec,
> > >>
> > >> So it is interesting speedup
> > >
> > > but regress tests fails
> >
> > Oops, I failed to check src/pl/plpgsql tests.
> >
> > Fixed in the attached.
>
> Added a regression test based on examples discussed here too.
>

It is working without problems

I think this patch is very interesting for Postgres 13

Regards

Pavel

>
> Thanks,
> Amit
>


Re: plan cache overhead on plpgsql expression

2020-02-18 Thread Amit Langote
On Wed, Feb 19, 2020 at 3:56 PM Amit Langote  wrote:
> On Wed, Feb 19, 2020 at 3:38 PM Pavel Stehule  wrote:
> > st 19. 2. 2020 v 7:30 odesílatel Pavel Stehule  
> > napsal:
> >> út 18. 2. 2020 v 17:08 odesílatel Amit Langote  
> >> napsal:
> >>> > I updated the patch to do that.
> >>> >
> >>> > With the new patch, `select foo()`, with inline-able sql_incr() in it,
> >>> > runs in 679 ms.
> >>> >
> >>> > Without any inline-able function, it runs in 330 ms, whereas with
> >>> > HEAD, it takes 590 ms.
> >>>
> >>> I polished it a bit.
> >>
> >>
> >> the performance looks very interesting - on my comp the execution time of  
> >> 1 iterations was decreased from 34 sec to 15 sec,
> >>
> >> So it is interesting speedup
> >
> > but regress tests fails
>
> Oops, I failed to check src/pl/plpgsql tests.
>
> Fixed in the attached.

Added a regression test based on examples discussed here too.

Thanks,
Amit
diff --git a/src/backend/optimizer/util/clauses.c 
b/src/backend/optimizer/util/clauses.c
index 2d3ec22407..5ce0079a12 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -108,6 +108,7 @@ static bool 
contain_volatile_functions_not_nextval_walker(Node *node, void *cont
 static bool max_parallel_hazard_walker(Node *node,
   
max_parallel_hazard_context *context);
 static bool contain_nonstrict_functions_walker(Node *node, void *context);
+static bool contain_inlinable_functions_walker(Node *node, void *context);
 static bool contain_context_dependent_node(Node *clause);
 static bool contain_context_dependent_node_walker(Node *node, int *flags);
 static bool contain_leaked_vars_walker(Node *node, void *context);
@@ -1218,6 +1219,63 @@ contain_nonstrict_functions_walker(Node *node, void 
*context)
  context);
 }
 
+/*
+ * Check clauses for inline-able functions
+ */
+
+bool
+contain_inlinable_functions(Node *node)
+{
+   return contain_inlinable_functions_walker(node, NULL);
+}
+
+/*
+ * can_inline_function - checks if a function is inline-able
+ */
+static bool
+can_inline_function(HeapTuple func_tuple)
+{
+   Form_pg_proc funcform = (Form_pg_proc) GETSTRUCT(func_tuple);
+
+   /*
+* Nope if the function is not SQL-language or has other showstopper
+* properties.  (The prokind and nargs checks are just paranoia.)
+*/
+   return  funcform->prolang == SQLlanguageId &&
+   funcform->prokind == PROKIND_FUNCTION &&
+   !funcform->prosecdef && !funcform->proretset &&
+   funcform->prorettype != RECORDOID &&
+   heap_attisnull(func_tuple, Anum_pg_proc_proconfig, 
NULL);
+}
+
+static bool
+can_inline_function_checker(Oid funcid, void *context)
+{
+   HeapTuple   func_tuple;
+   boolresult;
+
+   func_tuple = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
+   if (!HeapTupleIsValid(func_tuple))
+   elog(ERROR, "cache lookup failed for function %u", funcid);
+
+   result = can_inline_function(func_tuple);
+   ReleaseSysCache(func_tuple);
+
+   return result;
+}
+
+static bool
+contain_inlinable_functions_walker(Node *node, void *context)
+{
+   if (node == NULL)
+   return false;
+   if (check_functions_in_node(node, can_inline_function_checker, context))
+   return true;
+
+   return expression_tree_walker(node, contain_inlinable_functions_walker,
+ context);
+}
+
 /*
  * Check clauses for context-dependent nodes
  */
@@ -4022,7 +4080,8 @@ simplify_function(Oid funcid, Oid result_type, int32 
result_typmod,
Assert(newexpr != (Expr *) );
}
 
-   if (!newexpr && allow_non_const)
+   if (!newexpr && allow_non_const &&
+   can_inline_function(func_tuple))
newexpr = inline_function(funcid, result_type, result_collid,
  input_collid, 
args, funcvariadic,
  func_tuple, 
context);
@@ -4415,16 +4474,11 @@ inline_function(Oid funcid, Oid result_type, Oid 
result_collid,
int i;
 
/*
-* Forget it if the function is not SQL-language or has other 
showstopper
-* properties.  (The prokind and nargs checks are just paranoia.)
+* Caller should already have checked whether the function can be 
inlined
+* using 

Re: plan cache overhead on plpgsql expression

2020-02-18 Thread Amit Langote
On Wed, Feb 19, 2020 at 3:38 PM Pavel Stehule  wrote:
> st 19. 2. 2020 v 7:30 odesílatel Pavel Stehule  
> napsal:
>> út 18. 2. 2020 v 17:08 odesílatel Amit Langote  
>> napsal:
>>> > I updated the patch to do that.
>>> >
>>> > With the new patch, `select foo()`, with inline-able sql_incr() in it,
>>> > runs in 679 ms.
>>> >
>>> > Without any inline-able function, it runs in 330 ms, whereas with
>>> > HEAD, it takes 590 ms.
>>>
>>> I polished it a bit.
>>
>>
>> the performance looks very interesting - on my comp the execution time of  
>> 1 iterations was decreased from 34 sec to 15 sec,
>>
>> So it is interesting speedup
>
> but regress tests fails

Oops, I failed to check src/pl/plpgsql tests.

Fixed in the attached.

Thanks,
Amit
diff --git a/src/backend/optimizer/util/clauses.c 
b/src/backend/optimizer/util/clauses.c
index 2d3ec22407..5ce0079a12 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -108,6 +108,7 @@ static bool 
contain_volatile_functions_not_nextval_walker(Node *node, void *cont
 static bool max_parallel_hazard_walker(Node *node,
   
max_parallel_hazard_context *context);
 static bool contain_nonstrict_functions_walker(Node *node, void *context);
+static bool contain_inlinable_functions_walker(Node *node, void *context);
 static bool contain_context_dependent_node(Node *clause);
 static bool contain_context_dependent_node_walker(Node *node, int *flags);
 static bool contain_leaked_vars_walker(Node *node, void *context);
@@ -1218,6 +1219,63 @@ contain_nonstrict_functions_walker(Node *node, void 
*context)
  context);
 }
 
+/*
+ * Check clauses for inline-able functions
+ */
+
+bool
+contain_inlinable_functions(Node *node)
+{
+   return contain_inlinable_functions_walker(node, NULL);
+}
+
+/*
+ * can_inline_function - checks if a function is inline-able
+ */
+static bool
+can_inline_function(HeapTuple func_tuple)
+{
+   Form_pg_proc funcform = (Form_pg_proc) GETSTRUCT(func_tuple);
+
+   /*
+* Nope if the function is not SQL-language or has other showstopper
+* properties.  (The prokind and nargs checks are just paranoia.)
+*/
+   return  funcform->prolang == SQLlanguageId &&
+   funcform->prokind == PROKIND_FUNCTION &&
+   !funcform->prosecdef && !funcform->proretset &&
+   funcform->prorettype != RECORDOID &&
+   heap_attisnull(func_tuple, Anum_pg_proc_proconfig, 
NULL);
+}
+
+static bool
+can_inline_function_checker(Oid funcid, void *context)
+{
+   HeapTuple   func_tuple;
+   boolresult;
+
+   func_tuple = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
+   if (!HeapTupleIsValid(func_tuple))
+   elog(ERROR, "cache lookup failed for function %u", funcid);
+
+   result = can_inline_function(func_tuple);
+   ReleaseSysCache(func_tuple);
+
+   return result;
+}
+
+static bool
+contain_inlinable_functions_walker(Node *node, void *context)
+{
+   if (node == NULL)
+   return false;
+   if (check_functions_in_node(node, can_inline_function_checker, context))
+   return true;
+
+   return expression_tree_walker(node, contain_inlinable_functions_walker,
+ context);
+}
+
 /*
  * Check clauses for context-dependent nodes
  */
@@ -4022,7 +4080,8 @@ simplify_function(Oid funcid, Oid result_type, int32 
result_typmod,
Assert(newexpr != (Expr *) );
}
 
-   if (!newexpr && allow_non_const)
+   if (!newexpr && allow_non_const &&
+   can_inline_function(func_tuple))
newexpr = inline_function(funcid, result_type, result_collid,
  input_collid, 
args, funcvariadic,
  func_tuple, 
context);
@@ -4415,16 +4474,11 @@ inline_function(Oid funcid, Oid result_type, Oid 
result_collid,
int i;
 
/*
-* Forget it if the function is not SQL-language or has other 
showstopper
-* properties.  (The prokind and nargs checks are just paranoia.)
+* Caller should already have checked whether the function can be 
inlined
+* using can_function_inline().
 */
-   if (funcform->prolang != SQLlanguageId ||
-   funcform->prokind != PROKIND_FUNCTION ||
-   

Re: plan cache overhead on plpgsql expression

2020-02-18 Thread Pavel Stehule
st 19. 2. 2020 v 7:30 odesílatel Pavel Stehule 
napsal:

>
>
> út 18. 2. 2020 v 17:08 odesílatel Amit Langote 
> napsal:
>
>> On Tue, Feb 18, 2020 at 6:56 PM Amit Langote 
>> wrote:
>> > On Tue, Feb 18, 2020 at 2:56 PM Pavel Stehule 
>> wrote:
>> > > út 18. 2. 2020 v 6:03 odesílatel Amit Langote <
>> amitlangot...@gmail.com> napsal:
>> > >> I didn't send the patch, because it didn't handle the cases where a
>> > >> simple expression consists of an inline-able function(s) in it, which
>> > >> are better handled by a full-fledged planner call backed up by the
>> > >> plan cache.  If we don't do that then every evaluation of such
>> > >> "simple" expression needs to invoke the planner.  For example:
>> > >>
>> > >> Consider this inline-able SQL function:
>> > >>
>> > >> create or replace function sql_incr(a bigint)
>> > >> returns int
>> > >> immutable language sql as $$
>> > >> select a+1;
>> > >> $$;
>> > >>
>> > >> Then this revised body of your function foo():
>> > >>
>> > >> CREATE OR REPLACE FUNCTION public.foo()
>> > >>  RETURNS int
>> > >>  LANGUAGE plpgsql
>> > >>  IMMUTABLE
>> > >> AS $function$
>> > >> declare i bigint = 0;
>> > >> begin
>> > >>   while i < 100
>> > >>   loop
>> > >> i := sql_incr(i);
>> > >>   end loop; return i;
>> > >> end;
>> > >> $function$
>> > >> ;
>> > >>
>> > >> With HEAD `select foo()` finishes in 786 ms, whereas with the patch,
>> > >> it takes 5102 ms.
>> > >>
>> > >> I think the patch might be good idea to reduce the time to compute
>> > >> simple expressions in plpgsql, if we can address the above issue.
>> > >
>> > >
>> > > Your patch is very interesting - minimally it returns performance
>> before 8.2. The mentioned issue can be fixed if we disallow SQL functions
>> in this fast execution.
>> >
>> > I updated the patch to do that.
>> >
>> > With the new patch, `select foo()`, with inline-able sql_incr() in it,
>> > runs in 679 ms.
>> >
>> > Without any inline-able function, it runs in 330 ms, whereas with
>> > HEAD, it takes 590 ms.
>>
>> I polished it a bit.
>>
>
> the performance looks very interesting - on my comp the execution time of
> 1 iterations was decreased from 34 sec to 15 sec,
>
> So it is interesting speedup
>

but regress tests fails



> Pavel
>
>
>
>> Thanks,
>> Amit
>>
>


regression.out
Description: Binary data


regression.diffs
Description: Binary data


Re: plan cache overhead on plpgsql expression

2020-02-18 Thread Pavel Stehule
út 18. 2. 2020 v 17:08 odesílatel Amit Langote 
napsal:

> On Tue, Feb 18, 2020 at 6:56 PM Amit Langote 
> wrote:
> > On Tue, Feb 18, 2020 at 2:56 PM Pavel Stehule 
> wrote:
> > > út 18. 2. 2020 v 6:03 odesílatel Amit Langote 
> napsal:
> > >> I didn't send the patch, because it didn't handle the cases where a
> > >> simple expression consists of an inline-able function(s) in it, which
> > >> are better handled by a full-fledged planner call backed up by the
> > >> plan cache.  If we don't do that then every evaluation of such
> > >> "simple" expression needs to invoke the planner.  For example:
> > >>
> > >> Consider this inline-able SQL function:
> > >>
> > >> create or replace function sql_incr(a bigint)
> > >> returns int
> > >> immutable language sql as $$
> > >> select a+1;
> > >> $$;
> > >>
> > >> Then this revised body of your function foo():
> > >>
> > >> CREATE OR REPLACE FUNCTION public.foo()
> > >>  RETURNS int
> > >>  LANGUAGE plpgsql
> > >>  IMMUTABLE
> > >> AS $function$
> > >> declare i bigint = 0;
> > >> begin
> > >>   while i < 100
> > >>   loop
> > >> i := sql_incr(i);
> > >>   end loop; return i;
> > >> end;
> > >> $function$
> > >> ;
> > >>
> > >> With HEAD `select foo()` finishes in 786 ms, whereas with the patch,
> > >> it takes 5102 ms.
> > >>
> > >> I think the patch might be good idea to reduce the time to compute
> > >> simple expressions in plpgsql, if we can address the above issue.
> > >
> > >
> > > Your patch is very interesting - minimally it returns performance
> before 8.2. The mentioned issue can be fixed if we disallow SQL functions
> in this fast execution.
> >
> > I updated the patch to do that.
> >
> > With the new patch, `select foo()`, with inline-able sql_incr() in it,
> > runs in 679 ms.
> >
> > Without any inline-able function, it runs in 330 ms, whereas with
> > HEAD, it takes 590 ms.
>
> I polished it a bit.
>

the performance looks very interesting - on my comp the execution time of
1 iterations was decreased from 34 sec to 15 sec,

So it is interesting speedup

Pavel



> Thanks,
> Amit
>


Re: plan cache overhead on plpgsql expression

2020-02-18 Thread Amit Langote
On Tue, Feb 18, 2020 at 6:56 PM Amit Langote  wrote:
> On Tue, Feb 18, 2020 at 2:56 PM Pavel Stehule  wrote:
> > út 18. 2. 2020 v 6:03 odesílatel Amit Langote  
> > napsal:
> >> I didn't send the patch, because it didn't handle the cases where a
> >> simple expression consists of an inline-able function(s) in it, which
> >> are better handled by a full-fledged planner call backed up by the
> >> plan cache.  If we don't do that then every evaluation of such
> >> "simple" expression needs to invoke the planner.  For example:
> >>
> >> Consider this inline-able SQL function:
> >>
> >> create or replace function sql_incr(a bigint)
> >> returns int
> >> immutable language sql as $$
> >> select a+1;
> >> $$;
> >>
> >> Then this revised body of your function foo():
> >>
> >> CREATE OR REPLACE FUNCTION public.foo()
> >>  RETURNS int
> >>  LANGUAGE plpgsql
> >>  IMMUTABLE
> >> AS $function$
> >> declare i bigint = 0;
> >> begin
> >>   while i < 100
> >>   loop
> >> i := sql_incr(i);
> >>   end loop; return i;
> >> end;
> >> $function$
> >> ;
> >>
> >> With HEAD `select foo()` finishes in 786 ms, whereas with the patch,
> >> it takes 5102 ms.
> >>
> >> I think the patch might be good idea to reduce the time to compute
> >> simple expressions in plpgsql, if we can address the above issue.
> >
> >
> > Your patch is very interesting - minimally it returns performance before 
> > 8.2. The mentioned issue can be fixed if we disallow SQL functions in this 
> > fast execution.
>
> I updated the patch to do that.
>
> With the new patch, `select foo()`, with inline-able sql_incr() in it,
> runs in 679 ms.
>
> Without any inline-able function, it runs in 330 ms, whereas with
> HEAD, it takes 590 ms.

I polished it a bit.

Thanks,
Amit


plpgsql-simple-exprs_v3.patch
Description: Binary data


Re: plan cache overhead on plpgsql expression

2020-02-18 Thread Amit Langote
On Tue, Feb 18, 2020 at 2:56 PM Pavel Stehule  wrote:
> út 18. 2. 2020 v 6:03 odesílatel Amit Langote  
> napsal:
>> I didn't send the patch, because it didn't handle the cases where a
>> simple expression consists of an inline-able function(s) in it, which
>> are better handled by a full-fledged planner call backed up by the
>> plan cache.  If we don't do that then every evaluation of such
>> "simple" expression needs to invoke the planner.  For example:
>>
>> Consider this inline-able SQL function:
>>
>> create or replace function sql_incr(a bigint)
>> returns int
>> immutable language sql as $$
>> select a+1;
>> $$;
>>
>> Then this revised body of your function foo():
>>
>> CREATE OR REPLACE FUNCTION public.foo()
>>  RETURNS int
>>  LANGUAGE plpgsql
>>  IMMUTABLE
>> AS $function$
>> declare i bigint = 0;
>> begin
>>   while i < 100
>>   loop
>> i := sql_incr(i);
>>   end loop; return i;
>> end;
>> $function$
>> ;
>>
>> With HEAD `select foo()` finishes in 786 ms, whereas with the patch,
>> it takes 5102 ms.
>>
>> I think the patch might be good idea to reduce the time to compute
>> simple expressions in plpgsql, if we can address the above issue.
>
>
> Your patch is very interesting - minimally it returns performance before 8.2. 
> The mentioned issue can be fixed if we disallow SQL functions in this fast 
> execution.

I updated the patch to do that.

With the new patch, `select foo()`, with inline-able sql_incr() in it,
runs in 679 ms.

Without any inline-able function, it runs in 330 ms, whereas with
HEAD, it takes 590 ms.

Thanks,
Amit
diff --git a/src/backend/optimizer/util/clauses.c 
b/src/backend/optimizer/util/clauses.c
index 2d3ec22407..5ce0079a12 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -108,6 +108,7 @@ static bool 
contain_volatile_functions_not_nextval_walker(Node *node, void *cont
 static bool max_parallel_hazard_walker(Node *node,
   
max_parallel_hazard_context *context);
 static bool contain_nonstrict_functions_walker(Node *node, void *context);
+static bool contain_inlinable_functions_walker(Node *node, void *context);
 static bool contain_context_dependent_node(Node *clause);
 static bool contain_context_dependent_node_walker(Node *node, int *flags);
 static bool contain_leaked_vars_walker(Node *node, void *context);
@@ -1218,6 +1219,63 @@ contain_nonstrict_functions_walker(Node *node, void 
*context)
  context);
 }
 
+/*
+ * Check clauses for inline-able functions
+ */
+
+bool
+contain_inlinable_functions(Node *node)
+{
+   return contain_inlinable_functions_walker(node, NULL);
+}
+
+/*
+ * can_inline_function - checks if a function is inline-able
+ */
+static bool
+can_inline_function(HeapTuple func_tuple)
+{
+   Form_pg_proc funcform = (Form_pg_proc) GETSTRUCT(func_tuple);
+
+   /*
+* Nope if the function is not SQL-language or has other showstopper
+* properties.  (The prokind and nargs checks are just paranoia.)
+*/
+   return  funcform->prolang == SQLlanguageId &&
+   funcform->prokind == PROKIND_FUNCTION &&
+   !funcform->prosecdef && !funcform->proretset &&
+   funcform->prorettype != RECORDOID &&
+   heap_attisnull(func_tuple, Anum_pg_proc_proconfig, 
NULL);
+}
+
+static bool
+can_inline_function_checker(Oid funcid, void *context)
+{
+   HeapTuple   func_tuple;
+   boolresult;
+
+   func_tuple = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
+   if (!HeapTupleIsValid(func_tuple))
+   elog(ERROR, "cache lookup failed for function %u", funcid);
+
+   result = can_inline_function(func_tuple);
+   ReleaseSysCache(func_tuple);
+
+   return result;
+}
+
+static bool
+contain_inlinable_functions_walker(Node *node, void *context)
+{
+   if (node == NULL)
+   return false;
+   if (check_functions_in_node(node, can_inline_function_checker, context))
+   return true;
+
+   return expression_tree_walker(node, contain_inlinable_functions_walker,
+ context);
+}
+
 /*
  * Check clauses for context-dependent nodes
  */
@@ -4022,7 +4080,8 @@ simplify_function(Oid funcid, Oid result_type, int32 
result_typmod,
Assert(newexpr != (Expr *) );
}
 
-   if (!newexpr && allow_non_const)
+   if (!newexpr && allow_non_const &&
+   can_inline_function(func_tuple))

Re: plan cache overhead on plpgsql expression

2020-02-17 Thread Pavel Stehule
út 18. 2. 2020 v 6:03 odesílatel Amit Langote 
napsal:

> Hi,
>
> On Sun, Feb 16, 2020 at 11:13 PM Pavel Stehule 
> wrote:
> > when I do some profiling of plpgsql, usually I surprised how significant
> overhead has expression execution. Any calculations are very slow.
> >
> > This is not typical example of plpgsql, but it shows cleanly where is a
> overhead
> >
> > CREATE OR REPLACE FUNCTION public.foo()
> >  RETURNS void
> >  LANGUAGE plpgsql
> >  IMMUTABLE
> > AS $function$
> > declare i bigint = 0;
> > begin
> >   while i < 1
> >   loop
> > i := i + 1;
> >   end loop;
> > end;
> > $function$
> >
> > Profile of development  version
> >
> >   10,04%  plpgsql.so  [.] exec_eval_simple_expr
> >9,17%  postgres[.] AcquireExecutorLocks
> >7,01%  postgres[.] ExecInterpExpr
> >5,86%  postgres[.]
> OverrideSearchPathMatchesCurrent
> >4,71%  postgres[.] GetCachedPlan
> >4,14%  postgres[.] AcquirePlannerLocks
> >3,72%  postgres[.] RevalidateCachedQuery
> >3,56%  postgres[.] MemoryContextReset
> >3,43%  plpgsql.so  [.] plpgsql_param_eval_var
>
> I was thinking about this overhead many months back and had even
> written a patch to avoid going to the planner for "simple"
> expressions, which can be handled by the executor.  Here is what the
> performance looks like:
>
> HEAD:
>
> latency: 31979.393 ms
>
> 18.32%  postgres  postgres   [.] ExecInterpExpr
> 11.37%  postgres  plpgsql.so [.] exec_eval_expr
>  8.58%  postgres  plpgsql.so [.] plpgsql_param_eval_var
>  8.31%  postgres  plpgsql.so [.] exec_stmt
>  6.44%  postgres  postgres   [.] GetCachedPlan
>  5.47%  postgres  postgres   [.] AcquireExecutorLocks
>  5.30%  postgres  postgres   [.] RevalidateCachedQuery
>  4.79%  postgres  plpgsql.so [.] exec_assign_value
>  4.41%  postgres  postgres   [.] SPI_plan_get_cached_plan
>  4.36%  postgres  postgres   [.] MemoryContextReset
>  4.22%  postgres  postgres   [.] ReleaseCachedPlan
>  4.03%  postgres  postgres   [.]
> OverrideSearchPathMatchesCurrent
>  2.63%  postgres  plpgsql.so [.] exec_assign_expr
>  2.11%  postgres  postgres   [.] int84lt
>  1.95%  postgres  postgres   [.]
> ResourceOwnerForgetPlanCacheRef
>  1.71%  postgres  postgres   [.] int84pl
>  1.57%  postgres  postgres   [.]
> ResourceOwnerRememberPlanCacheRef
>  1.38%  postgres  postgres   [.] recomputeNamespacePath
>  1.35%  postgres  postgres   [.] ScanQueryForLocks
>  1.24%  postgres  plpgsql.so [.] exec_cast_value
>  0.38%  postgres  postgres   [.]
> ResourceOwnerEnlargePlanCacheRefs
>  0.05%  postgres  [kernel.kallsyms]  [k] __do_softirq
>  0.03%  postgres  postgres   [.] GetUserId
>
> Patched:
>
> latency: 21011.871 ms
>
> 28.26%  postgres  postgres   [.] ExecInterpExpr
> 12.26%  postgres  plpgsql.so [.] plpgsql_param_eval_var
> 12.02%  postgres  plpgsql.so [.] exec_stmt
> 11.10%  postgres  plpgsql.so [.] exec_eval_expr
> 10.05%  postgres  postgres   [.] SPI_plan_is_valid
>  7.09%  postgres  postgres   [.] MemoryContextReset
>  6.65%  postgres  plpgsql.so [.] exec_assign_value
>  3.53%  postgres  plpgsql.so [.] exec_assign_expr
>  2.91%  postgres  postgres   [.] int84lt
>  2.61%  postgres  postgres   [.] int84pl
>  2.42%  postgres  plpgsql.so [.] exec_cast_value
>  0.86%  postgres  postgres   [.] CachedPlanIsValid
>  0.16%  postgres  plpgsql.so [.] SPI_plan_is_valid@plt
>  0.05%  postgres  [kernel.kallsyms]  [k] __do_softirq
>  0.03%  postgres  [kernel.kallsyms]  [k] finish_task_switch
>
> I didn't send the patch, because it didn't handle the cases where a
> simple expression consists of an inline-able function(s) in it, which
> are better handled by a full-fledged planner call backed up by the
> plan cache.  If we don't do that then every evaluation of such
> "simple" expression needs to invoke the planner.  For example:
>
> Consider this inline-able SQL function:
>
> create or replace function sql_incr(a bigint)
> returns int
> immutable language sql as $$
> select a+1;
> $$;
>
> Then this revised body of your function foo():
>
> CREATE OR REPLACE FUNCTION public.foo()
>  RETURNS int
>  LANGUAGE plpgsql
>  IMMUTABLE
> AS $function$
> declare i bigint = 0;
> begin
>   while i < 100
>   loop
> i := sql_incr(i);
>   end loop; return i;
> end;
> $function$
> ;
>
> With HEAD `select foo()` finishes in 786 ms, whereas with the patch,
> it 

Re: plan cache overhead on plpgsql expression

2020-02-17 Thread Amit Langote
Hi,

On Sun, Feb 16, 2020 at 11:13 PM Pavel Stehule  wrote:
> when I do some profiling of plpgsql, usually I surprised how significant 
> overhead has expression execution. Any calculations are very slow.
>
> This is not typical example of plpgsql, but it shows cleanly where is a 
> overhead
>
> CREATE OR REPLACE FUNCTION public.foo()
>  RETURNS void
>  LANGUAGE plpgsql
>  IMMUTABLE
> AS $function$
> declare i bigint = 0;
> begin
>   while i < 1
>   loop
> i := i + 1;
>   end loop;
> end;
> $function$
>
> Profile of development  version
>
>   10,04%  plpgsql.so  [.] exec_eval_simple_expr
>9,17%  postgres[.] AcquireExecutorLocks
>7,01%  postgres[.] ExecInterpExpr
>5,86%  postgres[.] 
> OverrideSearchPathMatchesCurrent
>4,71%  postgres[.] GetCachedPlan
>4,14%  postgres[.] AcquirePlannerLocks
>3,72%  postgres[.] RevalidateCachedQuery
>3,56%  postgres[.] MemoryContextReset
>3,43%  plpgsql.so  [.] plpgsql_param_eval_var

I was thinking about this overhead many months back and had even
written a patch to avoid going to the planner for "simple"
expressions, which can be handled by the executor.  Here is what the
performance looks like:

HEAD:

latency: 31979.393 ms

18.32%  postgres  postgres   [.] ExecInterpExpr
11.37%  postgres  plpgsql.so [.] exec_eval_expr
 8.58%  postgres  plpgsql.so [.] plpgsql_param_eval_var
 8.31%  postgres  plpgsql.so [.] exec_stmt
 6.44%  postgres  postgres   [.] GetCachedPlan
 5.47%  postgres  postgres   [.] AcquireExecutorLocks
 5.30%  postgres  postgres   [.] RevalidateCachedQuery
 4.79%  postgres  plpgsql.so [.] exec_assign_value
 4.41%  postgres  postgres   [.] SPI_plan_get_cached_plan
 4.36%  postgres  postgres   [.] MemoryContextReset
 4.22%  postgres  postgres   [.] ReleaseCachedPlan
 4.03%  postgres  postgres   [.] OverrideSearchPathMatchesCurrent
 2.63%  postgres  plpgsql.so [.] exec_assign_expr
 2.11%  postgres  postgres   [.] int84lt
 1.95%  postgres  postgres   [.] ResourceOwnerForgetPlanCacheRef
 1.71%  postgres  postgres   [.] int84pl
 1.57%  postgres  postgres   [.] ResourceOwnerRememberPlanCacheRef
 1.38%  postgres  postgres   [.] recomputeNamespacePath
 1.35%  postgres  postgres   [.] ScanQueryForLocks
 1.24%  postgres  plpgsql.so [.] exec_cast_value
 0.38%  postgres  postgres   [.] ResourceOwnerEnlargePlanCacheRefs
 0.05%  postgres  [kernel.kallsyms]  [k] __do_softirq
 0.03%  postgres  postgres   [.] GetUserId

Patched:

latency: 21011.871 ms

28.26%  postgres  postgres   [.] ExecInterpExpr
12.26%  postgres  plpgsql.so [.] plpgsql_param_eval_var
12.02%  postgres  plpgsql.so [.] exec_stmt
11.10%  postgres  plpgsql.so [.] exec_eval_expr
10.05%  postgres  postgres   [.] SPI_plan_is_valid
 7.09%  postgres  postgres   [.] MemoryContextReset
 6.65%  postgres  plpgsql.so [.] exec_assign_value
 3.53%  postgres  plpgsql.so [.] exec_assign_expr
 2.91%  postgres  postgres   [.] int84lt
 2.61%  postgres  postgres   [.] int84pl
 2.42%  postgres  plpgsql.so [.] exec_cast_value
 0.86%  postgres  postgres   [.] CachedPlanIsValid
 0.16%  postgres  plpgsql.so [.] SPI_plan_is_valid@plt
 0.05%  postgres  [kernel.kallsyms]  [k] __do_softirq
 0.03%  postgres  [kernel.kallsyms]  [k] finish_task_switch

I didn't send the patch, because it didn't handle the cases where a
simple expression consists of an inline-able function(s) in it, which
are better handled by a full-fledged planner call backed up by the
plan cache.  If we don't do that then every evaluation of such
"simple" expression needs to invoke the planner.  For example:

Consider this inline-able SQL function:

create or replace function sql_incr(a bigint)
returns int
immutable language sql as $$
select a+1;
$$;

Then this revised body of your function foo():

CREATE OR REPLACE FUNCTION public.foo()
 RETURNS int
 LANGUAGE plpgsql
 IMMUTABLE
AS $function$
declare i bigint = 0;
begin
  while i < 100
  loop
i := sql_incr(i);
  end loop; return i;
end;
$function$
;

With HEAD `select foo()` finishes in 786 ms, whereas with the patch,
it takes 5102 ms.

I think the patch might be good idea to reduce the time to compute
simple expressions in plpgsql, if we can address the above issue.

Thanks,
Amit
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index b7c8d663fc..b292948853 100644
--- 

Re: plan cache overhead on plpgsql expression

2020-02-16 Thread Pavel Stehule
ne 16. 2. 2020 v 15:12 odesílatel Pavel Stehule 
napsal:

> Hi
>
> when I do some profiling of plpgsql, usually I surprised how significant
> overhead has expression execution. Any calculations are very slow.
>
> This is not typical example of plpgsql, but it shows cleanly where is a
> overhead
>
> CREATE OR REPLACE FUNCTION public.foo()
>  RETURNS void
>  LANGUAGE plpgsql
>  IMMUTABLE
> AS $function$
> declare i bigint = 0;
> begin
>   while i < 1
>   loop
> i := i + 1;
>   end loop;
> end;
> $function$
>
>
> Is interesting so overhead of plan cache about 15%
>
> The execution needs 32 sec on Postgres13 and 27sec on Postgres8.2
>

On same computer same example in Perl needs only 7 sec.

Regards

Pavel


> Regards
>
> Pavel
>
>