Re: WIP Patch: Precalculate stable functions, infrastructure v1

2022-05-23 Thread David Geier
Hi hackers!

I would like to revive this thread. At ServiceNow we recurringly encounter
queries that are much slower than they would have to be, because of
frequent calls to uncached stable functions with constant arguments (mostly
to_date()). We've seen e.g. queries that get more than 8x faster by
temporarily changing to_date() from stable to immutable.

I would be glad to help bringing this effort forward. Was there more work
on the patch left than rebasing on latest master?
@Marina: do you have any plans to continue with this?

For reference here are all existing mailing list discussions I could find
on this topic:

- [WIP] Caching constant stable expressions per execution (Marti, 2011),
https://www.postgresql.org/message-id/flat/CABRT9RC-1wGxZC_Z5mwkdk70fgY2DRX3sLXzdP4voBKuKPZDow%40mail.gmail.com
- Caching for stable expressions with constant arguments v6 (Marti, 2012),
https://www.postgresql.org/message-id/flat/CABRT9RA-RomVS-yzQ2wUtZ=m-ev61lcbrl1p1j3jydpsttf...@mail.gmail.com
- WIP Patch: Precalculate stable functions (Marina, 2017),
https://www.postgresql.org/message-id/flat/ba261b9fc25dea4069d8ba9a8fcad...@postgrespro.ru
- WIP Patch: Precalculate stable functions, infrastructure v1 (Marina,
2017),
https://www.postgresql.org/message-id/flat/da87bb6a014e029176a04f6e50033cfb%40postgrespro.ru

--
David Geier
(ServiceNow)

On Mon, 23 May 2022 at 17:06, Andres Freund  wrote:

> On 2018-11-29 18:00:15 +0100, Dmitry Dolgov wrote:
> > > On Tue, Oct 2, 2018 at 4:22 AM Michael Paquier 
> wrote:
> > >
> > > On Thu, May 24, 2018 at 04:00:33PM +0300, Marina Polyakova wrote:
> > > > Here there's a 9-th version of the patches for the precalculation of
> stable
> > > > or immutable functions, stable or immutable operators and other
> nonvolatile
> > > > expressions. This is a try to execute cached expressions as
> PARAM_EXEC,
> > > > thanks to the comments of Tom Lane and Andres Freund [1].
> > >
> > > Please note that v9-0004 fails to apply, so a rebase is needed.  This
> > > patch is moved to next CF, waiting on author.
> >
> > Unfortunately, patch still has some conflicts, could you please post an
> updated
> > version?
>
> As nothing has happened since, I'm marking this as returned with
> feedback.
>
> Greetings,
>
> Andres Freund
>
>
>
>


Re: WIP Patch: Precalculate stable functions, infrastructure v1

2019-01-31 Thread Andres Freund
On 2018-11-29 18:00:15 +0100, Dmitry Dolgov wrote:
> > On Tue, Oct 2, 2018 at 4:22 AM Michael Paquier  wrote:
> >
> > On Thu, May 24, 2018 at 04:00:33PM +0300, Marina Polyakova wrote:
> > > Here there's a 9-th version of the patches for the precalculation of 
> > > stable
> > > or immutable functions, stable or immutable operators and other 
> > > nonvolatile
> > > expressions. This is a try to execute cached expressions as PARAM_EXEC,
> > > thanks to the comments of Tom Lane and Andres Freund [1].
> >
> > Please note that v9-0004 fails to apply, so a rebase is needed.  This
> > patch is moved to next CF, waiting on author.
> 
> Unfortunately, patch still has some conflicts, could you please post an 
> updated
> version?

As nothing has happened since, I'm marking this as returned with
feedback.

Greetings,

Andres Freund



Re: WIP Patch: Precalculate stable functions, infrastructure v1

2018-11-29 Thread Dmitry Dolgov
> On Tue, Oct 2, 2018 at 4:22 AM Michael Paquier  wrote:
>
> On Thu, May 24, 2018 at 04:00:33PM +0300, Marina Polyakova wrote:
> > Here there's a 9-th version of the patches for the precalculation of stable
> > or immutable functions, stable or immutable operators and other nonvolatile
> > expressions. This is a try to execute cached expressions as PARAM_EXEC,
> > thanks to the comments of Tom Lane and Andres Freund [1].
>
> Please note that v9-0004 fails to apply, so a rebase is needed.  This
> patch is moved to next CF, waiting on author.

Unfortunately, patch still has some conflicts, could you please post an updated
version?



Re: WIP Patch: Precalculate stable functions, infrastructure v1

2018-10-01 Thread Michael Paquier
Hi Marina,

On Thu, May 24, 2018 at 04:00:33PM +0300, Marina Polyakova wrote:
> Here there's a 9-th version of the patches for the precalculation of stable
> or immutable functions, stable or immutable operators and other nonvolatile
> expressions. This is a try to execute cached expressions as PARAM_EXEC,
> thanks to the comments of Tom Lane and Andres Freund [1].

Please note that v9-0004 fails to apply, so a rebase is needed.  This
patch is moved to next CF, waiting on author.
--
Michael


signature.asc
Description: PGP signature


Re: WIP Patch: Precalculate stable functions, infrastructure v1

2018-03-02 Thread Marina Polyakova

Ok!

On 02-03-2018 22:56, Andres Freund wrote:

Hi,

On 2018-03-02 11:22:01 +0300, Marina Polyakova wrote:
I fixed the failure that Thomas pointed out to me, and I'm finishing 
work on

it, but it took me a while to study this part of the executor..


I unfortunately think that makes this too late for v11, and we should
mark this as returned with feedback.

Greetings,

Andres Freund


--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: WIP Patch: Precalculate stable functions, infrastructure v1

2018-03-02 Thread Andres Freund
Hi,

On 2018-03-02 11:22:01 +0300, Marina Polyakova wrote:
> I fixed the failure that Thomas pointed out to me, and I'm finishing work on
> it, but it took me a while to study this part of the executor..

I unfortunately think that makes this too late for v11, and we should
mark this as returned with feedback.

Greetings,

Andres Freund



Re: WIP Patch: Precalculate stable functions, infrastructure v1

2018-03-02 Thread Marina Polyakova

Hello!

I fixed the failure that Thomas pointed out to me, and I'm finishing 
work on it, but it took me a while to study this part of the executor..


On 02-03-2018 0:11, Andres Freund wrote:

Hi,

On 2018-02-01 08:01:48 +0300, Marina Polyakova wrote:

> ISTM, there might be some value to consider all of them in the design of
> the new mechanism.

I'm sorry, the other parts have occupied all the time, and I'll work 
on it..


That has, as far as I can see, not happened. And the patch has been
reported as failing by Thomas a while ago.  So I'm inclined to mark 
this

as returned with feedback for this CF?

- Andres


--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: WIP Patch: Precalculate stable functions, infrastructure v1

2018-03-01 Thread Andres Freund
Hi,

On 2018-02-01 08:01:48 +0300, Marina Polyakova wrote:
> > ISTM, there might be some value to consider all of them in the design of
> > the new mechanism.
>
> I'm sorry, the other parts have occupied all the time, and I'll work on it..

That has, as far as I can see, not happened. And the patch has been
reported as failing by Thomas a while ago.  So I'm inclined to mark this
as returned with feedback for this CF?

- Andres



Re: WIP Patch: Precalculate stable functions, infrastructure v1

2018-02-05 Thread Marina Polyakova

Hello!

Thank you for reporting! I'll try to get it on our buildfarm..

On 05-02-2018 0:10, Thomas Munro wrote:

On Thu, Feb 1, 2018 at 6:01 PM, Marina Polyakova
 wrote:
This is the 8-th version of the patch for the precalculation of stable 
or
immutable functions, stable or immutable operators and other 
nonvolatile
expressions. This is a try to fix the most problems (I'm sorry, it 
took some
time..) that Tom Lane and Andres Freund mentioned in [1], [2] and [3]. 
It is
based on the top of master, and on my computer make check-world 
passes. And

I'll continue work on it.


Hi Marina,

FYI I saw a repeatable crash in the contrib regression tests when
running make check-world with this patch applied.

test hstore_plperl... FAILED (test process exited with exit 
code 2)
test hstore_plperlu   ... FAILED (test process exited with exit 
code 2)

test create_transform ... FAILED

I'm not sure why it passes for you but fails here, but we can see from
the backtrace[1] that ExecInitExprRec is receiving a null node pointer
on this Ubuntu Trusty GCC 4.8 amd64 system.

[1] https://travis-ci.org/postgresql-cfbot/postgresql/builds/337255374


--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: WIP Patch: Precalculate stable functions, infrastructure v1

2018-02-04 Thread Thomas Munro
On Thu, Feb 1, 2018 at 6:01 PM, Marina Polyakova
 wrote:
> This is the 8-th version of the patch for the precalculation of stable or
> immutable functions, stable or immutable operators and other nonvolatile
> expressions. This is a try to fix the most problems (I'm sorry, it took some
> time..) that Tom Lane and Andres Freund mentioned in [1], [2] and [3]. It is
> based on the top of master, and on my computer make check-world passes. And
> I'll continue work on it.

Hi Marina,

FYI I saw a repeatable crash in the contrib regression tests when
running make check-world with this patch applied.

test hstore_plperl... FAILED (test process exited with exit code 2)
test hstore_plperlu   ... FAILED (test process exited with exit code 2)
test create_transform ... FAILED

I'm not sure why it passes for you but fails here, but we can see from
the backtrace[1] that ExecInitExprRec is receiving a null node pointer
on this Ubuntu Trusty GCC 4.8 amd64 system.

[1] https://travis-ci.org/postgresql-cfbot/postgresql/builds/337255374

-- 
Thomas Munro
http://www.enterprisedb.com



Re: WIP Patch: Precalculate stable functions, infrastructure v1

2018-01-25 Thread Marina Polyakova

Hello!

On 24-01-2018 23:36, Andres Freund wrote:

On 2018-01-24 15:10:56 -0500, Tom Lane wrote:
...

Keeping the stored value of a CachedExpr in a Param slot is an
interesting idea indeed.


We keep coming back to this, IIRC we had a pretty similar discussion
around redesigning caseValue_datum/isNull domainValue_datum/isNull to 
be

less ugly. There also was
https://www.postgresql.org/message-id/20171116182208.kcvf75nfaldv3...@alap3.anarazel.de
where we discussed using something similar to PARAM_EXEC Param nodes to
allow inlining of volatile functions.

ISTM, there might be some value to consider all of them in the design 
of

the new mechanism.


Thank you both very much for this discussion and for the link on that 
thread! Now I'm working on the patch, thanks to Tom Lane's comments 
earlier [1], and I'll try to implement something of this..


[1] 
https://www.postgresql.org/message-id/403e0ae329c6868b3f3467eac92cc04d%40postgrespro.ru


--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: WIP Patch: Precalculate stable functions, infrastructure v1

2018-01-24 Thread Andres Freund
On 2018-01-24 15:10:56 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > On 2018-01-16 17:05:01 -0500, Tom Lane wrote:
> >> I'm curious to know whether Andres has some other ideas, or whether he
> >> feels this is all a big wart on the compiled-expression concept.
>
> > I don't have too many "artistic" concerns from the compiled expression
> > POV. The biggest issue I see is that it'll make it a bit harder to
> > separate out the expression compilation phase from the expression
> > instantiation phase - something I think we definitely want.
>
> Hmm, there's no such distinction now, so could you explain what you
> have in mind there?

There's a few related concerns I have:
- For OLTP workloads using prepared statements, we often spend the
  majority of the time doing ExecInitExpr() and related tasks like
  computing tupledescs.
- For OLTP workloads the low allocation density for things hanging off
  ExprState's and PlanState nodes is a significant concern. The number
  of allocations cause overhead, the overhead wastes memory and lowers
  cache hit ratios.
- For JIT we currently end up encoding specific pointer values into the
  generated code. As these obviously prevent reuse of the generated
  function, this noticeably reduces the applicability of JITing to fewer
  usecases. JITing is actually quite beneficial for a lot of OLTP
  workloads too, but it's too expensive to do every query.

To address these, I think we may want to split the the division of labor
a bit. Expression instantiation (i.e. ExecReadyExpr()) should happen at
executor startup, but in a lot of cases "compiling" the steps itself
should happen at plan time. Obviously that means the steps themselves
can't contain plain pointers, as the per-execution memory will be
located in different places.  So I think what we should have is that
expression initialization just computes the size of required memory for
all steps and puts *offsets* into that in the steps. After that
expression instantiation either leaves them alone and evaluation uses
relative pointers (cheap-ish e.g. on x86 due to lea), or just turn the
relative pointers into absolute ones.
That means that all the memory for all steps of an ExprState would be
allocated in one chunk, reducing allocation overhead and increasing
cache hit ratios considerably.

I've experimented a bit with a rough rough hack of the above (purely at
execution time), and it doesn't seem too hard.


> Keeping the stored value of a CachedExpr in a Param slot is an
> interesting idea indeed.

We keep coming back to this, IIRC we had a pretty similar discussion
around redesigning caseValue_datum/isNull domainValue_datum/isNull to be
less ugly. There also was
https://www.postgresql.org/message-id/20171116182208.kcvf75nfaldv3...@alap3.anarazel.de
where we discussed using something similar to PARAM_EXEC Param nodes to
allow inlining of volatile functions.

ISTM, there might be some value to consider all of them in the design of
the new mechanism.

Greetings,

Andres Freund



Re: WIP Patch: Precalculate stable functions, infrastructure v1

2018-01-24 Thread Tom Lane
Andres Freund  writes:
> On 2018-01-16 17:05:01 -0500, Tom Lane wrote:
>> I'm curious to know whether Andres has some other ideas, or whether he
>> feels this is all a big wart on the compiled-expression concept.

> I don't have too many "artistic" concerns from the compiled expression
> POV. The biggest issue I see is that it'll make it a bit harder to
> separate out the expression compilation phase from the expression
> instantiation phase - something I think we definitely want.

Hmm, there's no such distinction now, so could you explain what you
have in mind there?

>> I don't think there are any existing cases where we keep any
>> meaningful state across executions of a compiled-expression data
>> structure; maybe that's a bad idea in itself.

> Storing all cached values in an EState or ExprContext (the
> latter referring to the former) somewhat alike the values for Param's
> sounds a lot more reasonable to me.
> ...
> This all reminds me a lot of the infrastructure for Params...

Yeah, one thing I was thinking about in connection with this is the
stuff associated with propagating changes in outer-reference Params
(the extParam/allParam/chgParam mess).  I wonder if we could find
a way to unify that with this feature.

Keeping the stored value of a CachedExpr in a Param slot is an
interesting idea indeed.

regards, tom lane



Re: WIP Patch: Precalculate stable functions, infrastructure v1

2018-01-24 Thread Andres Freund
Hi,


On 2018-01-16 17:05:01 -0500, Tom Lane wrote:
> [ I'm sending this comment separately because I think it's an issue
> Andres might take an interest in. ]

Thanks for that. I indeed am interested. Sorry for the late response,
was very deep into the JIT patch.


> Marina Polyakova  writes:
> > [ v7-0001-Precalculate-stable-and-immutable-functions.patch ]
> 
> Another thing that's bothering me is that the execution semantics
> you're proposing for CachedExpr seem rather inflexible.  AFAICS, once a
> CachedExpr has run once, it will hang on to the result value and keep
> returning that for the entire lifespan of the compiled expression.
> We already noted that that breaks plpgsql's "simple expression"
> logic, and it seems inevitable to me that it will be an issue for
> other places as well.  I think it'd be a better design if we had some
> provision for resetting the cached values, short of recompiling the
> expression from scratch.

Hm. Yes, that makes me uncomfortable as well.


> One way that occurs to me to do this is to replace the simple boolean
> isExecuted flags with a generation counter, and add a master generation
> counter to ExprState.  The rule for executing CachedExpr would be "if my
> generation counter is different from the ExprState's counter, then
> evaluate the subexpression and copy the ExprState's counter into mine".
> Then the procedure for forcing recalculation of cached values is just to
> increment the ExprState's counter.  There are other ways one could imagine
> doing this --- for instance, I initially thought of keeping the master
> counter in the ExprContext being used to run the expression.  But you need
> some way to remember what counter value was used last with a particular
> expression, so probably keeping it in ExprState is better.

I'm not a big fan of this solution. We seem to be inventing more and
more places we keep state, rather than the contrary.


> Or we could just brute-force it by providing a function that runs through
> a compiled expression step list and resets the isExecuted flag for each
> EEOP_CACHEDEXPR_IF_CACHED step it finds.  A slightly less brute-force
> way is to link those steps together in a list, so that the function
> doesn't have to visit irrelevant steps.  If the reset function were seldom
> used then the extra cycles for this wouldn't be very expensive.  But I'm
> not sure it will be seldom used --- it seems like plpgsql simple
> expressions will be doing this every time --- so I think the counter
> approach might be a better idea.

Hm, that sounds like it'd not be cheap.


> I'm curious to know whether Andres has some other ideas, or whether he
> feels this is all a big wart on the compiled-expression concept.

I don't have too many "artistic" concerns from the compiled expression
POV. The biggest issue I see is that it'll make it a bit harder to
separate out the expression compilation phase from the expression
instantiation phase - something I think we definitely want.


> I don't think there are any existing cases where we keep any
> meaningful state across executions of a compiled-expression data
> structure; maybe that's a bad idea in itself.

To me, who has *not* followed the thread in detail, it sounds like the
relevant data shouldn't be stored inside the expression itself.  For
one, we do not want to have to visit every single simple expression and
reset them, for another it architecturally doesn't seem the right place
to me.  Storing all cached values in an EState or ExprContext (the
latter referring to the former) somewhat alike the values for Param's
sounds a lot more reasonable to me.

Besides that it seems to make it a lot easier to reset the values, it
also seems like it makes it a lot cleaner to cache stable functions
across multiple expressions in different places in a query? ISTM having
expression steps to actually compute the expression value in every
referencing expression is quite the waste.

This all reminds me a lot of the infrastructure for Params...

Greetings,

Andres Freund



Re: WIP Patch: Precalculate stable functions, infrastructure v1

2018-01-20 Thread Marina Polyakova

On 17-01-2018 1:05, Tom Lane wrote:

[ I'm sending this comment separately because I think it's an issue
Andres might take an interest in. ]

Marina Polyakova  writes:

[ v7-0001-Precalculate-stable-and-immutable-functions.patch ]


Another thing that's bothering me is that the execution semantics
you're proposing for CachedExpr seem rather inflexible.  AFAICS, once a
CachedExpr has run once, it will hang on to the result value and keep
returning that for the entire lifespan of the compiled expression.
We already noted that that breaks plpgsql's "simple expression"
logic, and it seems inevitable to me that it will be an issue for
other places as well.  I think it'd be a better design if we had some
provision for resetting the cached values, short of recompiling the
expression from scratch.

One way that occurs to me to do this is to replace the simple boolean
isExecuted flags with a generation counter, and add a master generation
counter to ExprState.  The rule for executing CachedExpr would be "if 
my

generation counter is different from the ExprState's counter, then
evaluate the subexpression and copy the ExprState's counter into mine".
Then the procedure for forcing recalculation of cached values is just 
to
increment the ExprState's counter.  There are other ways one could 
imagine

doing this --- for instance, I initially thought of keeping the master
counter in the ExprContext being used to run the expression.  But you 
need

some way to remember what counter value was used last with a particular
expression, so probably keeping it in ExprState is better.

Or we could just brute-force it by providing a function that runs 
through

a compiled expression step list and resets the isExecuted flag for each
EEOP_CACHEDEXPR_IF_CACHED step it finds.  A slightly less brute-force
way is to link those steps together in a list, so that the function
doesn't have to visit irrelevant steps.  If the reset function were 
seldom
used then the extra cycles for this wouldn't be very expensive.  But 
I'm

not sure it will be seldom used --- it seems like plpgsql simple
expressions will be doing this every time --- so I think the counter
approach might be a better idea.


Thank you very much! I'll try to implement something from this.


I'm curious to know whether Andres has some other ideas, or whether he
feels this is all a big wart on the compiled-expression concept.  I 
don't

think there are any existing cases where we keep any meaningful state
across executions of a compiled-expression data structure; maybe that's
a bad idea in itself.


--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: WIP Patch: Precalculate stable functions, infrastructure v1

2018-01-20 Thread Marina Polyakova

As I said, thank you so much for your comments!!

On 17-01-2018 0:30, Tom Lane wrote:

...
This is indeed quite a large patch, but it seems to me it could become
smaller. After a bit of review:

1. I do not like what you've done with ParamListInfo. The changes 
around

that are invasive, accounting for a noticeable part of the patch bulk,
and I don't think they're well designed. Having to cast back and forth
between ParamListInfo and ParamListInfoCommon and so on is ugly and 
prone

to cause (or hide) errors. And I don't really understand why
ParamListInfoPrecalculationData exists at all. Couldn't you have gotten
the same result with far fewer notational changes by defining another
PARAM_FLAG bit in the existing pflags field? (Or alternatively, maybe
the real need here is for another ParamKind value for Param nodes?)

I also dislike this approach because it effectively throws away the
support for "virtual" param arrays that I added in commit 6719b238e:
ParamListInfoPrecalculationData has no support for dynamically 
determined
parameter properties, which is surely something that somebody will 
need.

(It's just luck that the patch doesn't break plpgsql today.) I realize
that that's a recent commit and the code I'm complaining about predates
it, but we need to adjust this so that it fits in with the new 
approach.

See comment block at lines 25ff in params.h.


I'll try to use ParamListInfoData for generic plans (= to get cached 
expressions for params of prepared statements where possible) without 
changing its infrastructure.


2. I don't follow the need for the also-rather-invasive changes to 
domain

constraint data structures. I do see that the patch attempts to make
CoerceToDomain nodes cacheable, which is flat wrong and has to be 
ripped

out. You *cannot* assume that the planner has access to the same domain
constraints that will apply at runtime.


I'm sorry, I did not know about this :-[


I've occasionally thought that we should hook domain constraint changes
into the plan invalidation mechanism, which would make it possible for
the planner to assume that the constraints seen at planning time will
apply at execution. Whereupon we could have the planner insert domain
constraint expressions into the plan rather than leaving those to be
collected at query startup by execExpr.c, and then do things like
constant-folding and cacheing CoerceToDomain nodes. But that would be
a rather large and very domain-specific change, and so it would be fit
material for a different patch IMO. I recommend that for now you just
treat CoerceToDomain as an uncacheable expression type and rip all the
domain-related changes out of this patch.


I'll fix this.


3. I think you should also try hard to get rid of the need for
PlannedStmt.hasCachedExpr. AFAICS there's only one place that is
using that flag, which is exec_simple_check_plan, and I have to
think there are better ways we could deal with that. In particular,
I don't understand why you haven't simply set up plpgsql parameter
references to be noncacheable. Or maybe what we'd better do is
disable CacheExpr insertions into potentially-simple plans in the
first place. As you have it here, it's possible for recompilation
of an expression to result in a change in whether it should be deemed
simple or not, which will break things (cf commit 00418c612).


I'm sorry, I'll fix the use of parameters in this case. And I'll think 
how to get rid of the need for PlannedStmt.hasCachedExpr when there're 
possible cached expressions without parameters.



4. I don't like the way that you've inserted
"replace_qual_cached_expressions" and
"replace_pathtarget_cached_expressions" calls into seemingly random 
places

in the planner. Why isn't that being done uniformly during expression
preprocessing? There's no apparent structure to where you've put these
calls, and so they seem really vulnerable to errors of omission.


I'll fix this.


Also,
if this were done in expression preprocessing, there'd be a chance of
combining it with some existing pass over expression trees instead of
having to do a separate (and expensive) expression tree mutation.
I can't help suspecting that eval_const_expressions could take this on
as an additional responsibility with a lot less than a thousand new 
lines

of code.


From quick look I see no contradictions so I'll try to implement it.

5. BTW, cost_eval_cacheable_expr seems like useless restructuring as 
well.

Why aren't you just recursively applying the regular costing function?


Such a stupid mistake :( I'll fix this.


If you did all of the above it would result in a pretty significant
reduction of the number of places touched by the patch, which would 
make

it easier to see what's going on. Then we could start to discuss, for
instance, what does the "isConstParam" flag actually *mean* and why
is it different from PARAM_FLAG_CONST?


AFAIU they do not differ, and as I said above I'll try not to change the 
infrastructure of ParamListInfoData.




Re: WIP Patch: Precalculate stable functions, infrastructure v1

2018-01-17 Thread Marina Polyakova
Thank you so much for your comments!! I'll answer a bit later because 
now I'm trying to find a test for int128 on Solaris 10.. [1]


On 17-01-2018 1:05, Tom Lane wrote:

[ I'm sending this comment separately because I think it's an issue
Andres might take an interest in. ]

Marina Polyakova  writes:

[ v7-0001-Precalculate-stable-and-immutable-functions.patch ]


Another thing that's bothering me is that the execution semantics
you're proposing for CachedExpr seem rather inflexible.  AFAICS, once a
CachedExpr has run once, it will hang on to the result value and keep
returning that for the entire lifespan of the compiled expression.
We already noted that that breaks plpgsql's "simple expression"
logic, and it seems inevitable to me that it will be an issue for
other places as well.  I think it'd be a better design if we had some
provision for resetting the cached values, short of recompiling the
expression from scratch.

One way that occurs to me to do this is to replace the simple boolean
isExecuted flags with a generation counter, and add a master generation
counter to ExprState.  The rule for executing CachedExpr would be "if 
my

generation counter is different from the ExprState's counter, then
evaluate the subexpression and copy the ExprState's counter into mine".
Then the procedure for forcing recalculation of cached values is just 
to
increment the ExprState's counter.  There are other ways one could 
imagine

doing this --- for instance, I initially thought of keeping the master
counter in the ExprContext being used to run the expression.  But you 
need

some way to remember what counter value was used last with a particular
expression, so probably keeping it in ExprState is better.

Or we could just brute-force it by providing a function that runs 
through

a compiled expression step list and resets the isExecuted flag for each
EEOP_CACHEDEXPR_IF_CACHED step it finds.  A slightly less brute-force
way is to link those steps together in a list, so that the function
doesn't have to visit irrelevant steps.  If the reset function were 
seldom
used then the extra cycles for this wouldn't be very expensive.  But 
I'm

not sure it will be seldom used --- it seems like plpgsql simple
expressions will be doing this every time --- so I think the counter
approach might be a better idea.

I'm curious to know whether Andres has some other ideas, or whether he
feels this is all a big wart on the compiled-expression concept.  I 
don't

think there are any existing cases where we keep any meaningful state
across executions of a compiled-expression data structure; maybe that's
a bad idea in itself.

regards, tom lane


[1] 
https://www.postgresql.org/message-id/18209.1516059711%40sss.pgh.pa.us


--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: WIP Patch: Precalculate stable functions, infrastructure v1

2018-01-16 Thread Tom Lane
Aleksander Alekseev  writes:
> I can confirm this code works. However, since this is quite a large patch, I 
> believe we better have a second reviewer or a very attentive committer. 
> The new status of this patch is: Ready for Committer

This is indeed quite a large patch, but it seems to me it could become
smaller.  After a bit of review:

1. I do not like what you've done with ParamListInfo.  The changes around
that are invasive, accounting for a noticeable part of the patch bulk,
and I don't think they're well designed.  Having to cast back and forth
between ParamListInfo and ParamListInfoCommon and so on is ugly and prone
to cause (or hide) errors.  And I don't really understand why
ParamListInfoPrecalculationData exists at all.  Couldn't you have gotten
the same result with far fewer notational changes by defining another
PARAM_FLAG bit in the existing pflags field?  (Or alternatively, maybe
the real need here is for another ParamKind value for Param nodes?)

I also dislike this approach because it effectively throws away the
support for "virtual" param arrays that I added in commit 6719b238e:
ParamListInfoPrecalculationData has no support for dynamically determined
parameter properties, which is surely something that somebody will need.
(It's just luck that the patch doesn't break plpgsql today.)  I realize
that that's a recent commit and the code I'm complaining about predates
it, but we need to adjust this so that it fits in with the new approach.
See comment block at lines 25ff in params.h.

2. I don't follow the need for the also-rather-invasive changes to domain
constraint data structures.  I do see that the patch attempts to make
CoerceToDomain nodes cacheable, which is flat wrong and has to be ripped
out.  You *cannot* assume that the planner has access to the same domain
constraints that will apply at runtime.

I've occasionally thought that we should hook domain constraint changes
into the plan invalidation mechanism, which would make it possible for
the planner to assume that the constraints seen at planning time will
apply at execution.  Whereupon we could have the planner insert domain
constraint expressions into the plan rather than leaving those to be
collected at query startup by execExpr.c, and then do things like
constant-folding and cacheing CoerceToDomain nodes.  But that would be
a rather large and very domain-specific change, and so it would be fit
material for a different patch IMO.  I recommend that for now you just
treat CoerceToDomain as an uncacheable expression type and rip all the
domain-related changes out of this patch.

3. I think you should also try hard to get rid of the need for
PlannedStmt.hasCachedExpr.  AFAICS there's only one place that is
using that flag, which is exec_simple_check_plan, and I have to
think there are better ways we could deal with that.  In particular,
I don't understand why you haven't simply set up plpgsql parameter
references to be noncacheable.  Or maybe what we'd better do is
disable CacheExpr insertions into potentially-simple plans in the
first place.  As you have it here, it's possible for recompilation
of an expression to result in a change in whether it should be deemed
simple or not, which will break things (cf commit 00418c612).

4. I don't like the way that you've inserted
"replace_qual_cached_expressions" and
"replace_pathtarget_cached_expressions" calls into seemingly random places
in the planner.  Why isn't that being done uniformly during expression
preprocessing?  There's no apparent structure to where you've put these
calls, and so they seem really vulnerable to errors of omission.  Also,
if this were done in expression preprocessing, there'd be a chance of
combining it with some existing pass over expression trees instead of
having to do a separate (and expensive) expression tree mutation.
I can't help suspecting that eval_const_expressions could take this on
as an additional responsibility with a lot less than a thousand new lines
of code.

5. BTW, cost_eval_cacheable_expr seems like useless restructuring as well.
Why aren't you just recursively applying the regular costing function?


If you did all of the above it would result in a pretty significant
reduction of the number of places touched by the patch, which would make
it easier to see what's going on.  Then we could start to discuss, for
instance, what does the "isConstParam" flag actually *mean* and why
is it different from PARAM_FLAG_CONST?  And what in the world is
CheckBoundParams about?  The internal documentation in this patch
isn't quite nonexistent, but it's well short of being in a
committable state IMO.

regards, tom lane



Re: WIP Patch: Precalculate stable functions, infrastructure v1

2018-01-15 Thread Aleksander Alekseev
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

I can confirm this code works. However, since this is quite a large patch, I 
believe we better have a second reviewer or a very attentive committer. 

The new status of this patch is: Ready for Committer