Re: Optimze usage of immutable functions as relation

2019-09-24 Thread Tom Lane
rmrodrig...@carto.com writes:
> This commit is breaking some Postgis tests with custom types.

Hm, yeah, the code fails to consider the possibility that the function
returns a composite type.

For the moment I'm just going to make it punt if the function result
class isn't TYPEFUNC_SCALAR.  In principle, if we have a composite
result, we could disassemble it into per-column constant values, but
I'm not sure it'd be worth the work.

regards, tom lane




Re: Optimze usage of immutable functions as relation

2019-09-24 Thread rmrodriguez
Hi,

This commit is breaking some Postgis tests with custom types.

Here is a minimal repro (Postgis not required)

```
-- test custom types
create type t_custom_type AS (
valid bool,
reason varchar,
location varchar
);

create or replace function f_immutable_custom_type(i integer)
returns t_custom_type as
$$ declare oCustom t_custom_type;
begin
select into oCustom true as valid, 'OK' as reason, NULL as location;
return oCustom;
end; $$ language plpgsql immutable;

select valid, reason, location from f_immutable_custom_type(3);

drop function f_immutable_custom_type;
drop type t_custom_type;
```

Expected (PG12):
```
valid | reason | location
---++--
 t | OK |
(1 row)
```

Instead with master/HEAD (eb57bd9c1d) we are getting:
```
ERROR:  could not find attribute 2 in subquery targetlist
```

Reverting 8613eda50488c27d848f8e8caa493c9d8e1b5271 fixes it,
but I haven't looked into the reason behind the bug.

Regards
-- 
Raúl Marín Rodríguez
carto.com




Re: Optimze usage of immutable functions as relation

2019-08-01 Thread Tom Lane
Anastasia Lubennikova  writes:
> Thank you for pointing out on specific of int4() function,
> I updated tests to use dummy plpgsql function.
> I'm not sure if tests of various join types are redundant but I left them.
> As far as I understand, the principal motivation of this patch was to 
> optimize
> function scan joins that occur in FTS queries. For example:
> select * from test_tsquery, to_tsquery('english', 'new') q where 
> txtsample @@ q;
> So I also added another test to tsearch.sql to illustrate difference 
> between optimized and not optimized plans.

Fair enough.

I've pushed this after a good deal of further hackery on the code.
Notably

* I had no faith that we still guaranteed to perform
eval_const_expressions on every function-RTE expression.  Even if
that were true today, it'd be awfully easy to break in future,
if it only happened somewhere down in the recursion in
pull_up_subqueries_recurse.  So I moved that responsibility to
an earlier spot that clearly traverses all function RTEs.
A happy side benefit is that inline_set_returning_function
gets simpler because it can assume that that already happened.

* It looked to me like the pullup_constant_function code wasn't
covering all the places where it might need to replace Vars
referencing the function RTE.  I refactored things to avoid
having a bunch of code duplication while ensuring it hits
everyplace that pull_up_simple_subquery does.

regards, tom lane




Re: Optimze usage of immutable functions as relation

2019-08-01 Thread Anastasia Lubennikova

26.07.2019 21:26, Tom Lane wrote:

I took a quick look at this and I have a couple of gripes ---

* The naming and documentation of transform_const_function_to_result
seem pretty off-point to me.  ISTM the real goal of that function is to
pull up constant values from RTE_FUNCTION RTEs.  Replacing the RTE
with an RTE_RESULT is just a side-effect that's needed so that we
don't generate a useless FunctionScan plan node.  I'd probably call
it pull_up_constant_function or something like that.

* It would be useful for the commentary to point out that in principle we
could pull up any immutable (or, probably, even just stable) expression;
but we don't, (a) for fear of multiple evaluations of the result costing
us more than we can save, and (b) because a primary goal is to let the
constant participate in further const-folding, and of course that won't
happen for a non-Const.

Fixed

* The test cases are really pretty bogus, because int4(1) or int4(0) are
not function invocations at all.  The parser thinks those are no-op type
casts, and so what comes out of parse analysis is already just a plain
Const.  Thus, not one of these test cases is actually verifying that
const-folding of an immutable function has happened before we try to
pull up.  While you could dodge the problem today by, say, writing
int8(0) which isn't a no-op cast, I'd recommend staying away from
typename() notation altogether.  There's too much baggage there and too
little certainty that you wrote a function call not something else.
The existing test cases you changed, with int4(sin(1)) and so on,
are better examples of something that has to actively be folded to
a constant.


Thank you for pointing out on specific of int4() function,
I updated tests to use dummy plpgsql function.
I'm not sure if tests of various join types are redundant but I left them.

As far as I understand, the principal motivation of this patch was to 
optimize

function scan joins that occur in FTS queries. For example:
select * from test_tsquery, to_tsquery('english', 'new') q where 
txtsample @@ q;
So I also added another test to tsearch.sql to illustrate difference 
between optimized and not optimized plans.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 36fefd9..acc3776 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -1071,7 +1071,8 @@ preprocess_expression(PlannerInfo *root, Node *expr, int kind)
 		expr = flatten_join_alias_vars(root->parse, expr);
 
 	/*
-	 * Simplify constant expressions.
+	 * Simplify constant expressions. For function RTEs, this was already done
+	 * by pullup_simple_subqueries.
 	 *
 	 * Note: an essential effect of this is to convert named-argument function
 	 * calls to positional notation and insert the current actual values of
@@ -1085,7 +1086,8 @@ preprocess_expression(PlannerInfo *root, Node *expr, int kind)
 	 * careful to maintain AND/OR flatness --- that is, do not generate a tree
 	 * with AND directly under AND, nor OR directly under OR.
 	 */
-	expr = eval_const_expressions(root, expr);
+	if (kind != EXPRKIND_RTFUNC && kind != EXPRKIND_RTFUNC_LATERAL)
+		expr = eval_const_expressions(root, expr);
 
 	/*
 	 * If it's a qual or havingQual, canonicalize it.
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index 4fbc03f..e6e8fef 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -79,6 +79,9 @@ static Node *pull_up_simple_union_all(PlannerInfo *root, Node *jtnode,
 static void pull_up_union_leaf_queries(Node *setOp, PlannerInfo *root,
 	   int parentRTindex, Query *setOpQuery,
 	   int childRToffset);
+static void pull_up_constant_function(PlannerInfo *root, Node *jtnode,
+	 RangeTblEntry *rte,
+	 JoinExpr *lowest_nulling_outer_join);
 static void make_setop_translation_list(Query *query, Index newvarno,
 		List **translated_vars);
 static bool is_simple_subquery(Query *subquery, RangeTblEntry *rte,
@@ -754,6 +757,20 @@ pull_up_subqueries_recurse(PlannerInfo *root, Node *jtnode,
 			is_simple_values(root, rte))
 			return pull_up_simple_values(root, jtnode, rte);
 
+		/*
+		 * Or is it an immutable function that evaluated to a single Const?
+		 */
+		if (rte->rtekind == RTE_FUNCTION)
+		{
+			rte->functions = (List *)
+eval_const_expressions(root, (Node *) rte->functions);
+
+			pull_up_constant_function(root, jtnode, rte,
+	  lowest_nulling_outer_join);
+
+			return jtnode;
+		}
+
 		/* Otherwise, do nothing at this node. */
 	}
 	else if (IsA(jtnode, FromExpr))
@@ -1784,6 +1801,93 @@ is_simple_union_all_recurse(Node *setOp, Query *setOpQuery, List *colTypes)
 }
 
 /*
+ * pull_up_constant_function
+ *		Pull up an immutable function that was evaluated to a constant
+ *
+ * jtnode is a 

Re: Optimze usage of immutable functions as relation

2019-07-26 Thread Tom Lane
I wrote:
> * It would be useful for the commentary to point out that in principle we
> could pull up any immutable (or, probably, even just stable) expression;
> but we don't, (a) for fear of multiple evaluations of the result costing
> us more than we can save, and (b) because a primary goal is to let the
> constant participate in further const-folding, and of course that won't
> happen for a non-Const.

BTW, so far as I can see, none of the test cases demonstrate that such
further const-folding can happen.  Since this is all pretty processing-
order-sensitive, I think an explicit test that that's possible would
be a good idea.

regards, tom lane




Re: Optimze usage of immutable functions as relation

2019-07-26 Thread Tom Lane
Anastasia Lubennikova  writes:
> New version is in attachments.

I took a quick look at this and I have a couple of gripes ---

* The naming and documentation of transform_const_function_to_result
seem pretty off-point to me.  ISTM the real goal of that function is to
pull up constant values from RTE_FUNCTION RTEs.  Replacing the RTE
with an RTE_RESULT is just a side-effect that's needed so that we
don't generate a useless FunctionScan plan node.  I'd probably call
it pull_up_constant_function or something like that.

* It would be useful for the commentary to point out that in principle we
could pull up any immutable (or, probably, even just stable) expression;
but we don't, (a) for fear of multiple evaluations of the result costing
us more than we can save, and (b) because a primary goal is to let the
constant participate in further const-folding, and of course that won't
happen for a non-Const.

* The test cases are really pretty bogus, because int4(1) or int4(0) are
not function invocations at all.  The parser thinks those are no-op type
casts, and so what comes out of parse analysis is already just a plain
Const.  Thus, not one of these test cases is actually verifying that
const-folding of an immutable function has happened before we try to
pull up.  While you could dodge the problem today by, say, writing
int8(0) which isn't a no-op cast, I'd recommend staying away from
typename() notation altogether.  There's too much baggage there and too
little certainty that you wrote a function call not something else.
The existing test cases you changed, with int4(sin(1)) and so on,
are better examples of something that has to actively be folded to
a constant.

regards, tom lane




Re: Optimze usage of immutable functions as relation

2019-07-26 Thread Anastasia Lubennikova

23.07.2019 14:36, Anastasia Lubennikova :

08.07.2019 4:18, Thomas Munro:
The July Commitfest is here.  Could we please have a rebase of this 
patch?

Updated patch is in attachments.
I've only resolved one small cosmetic merge conflict.

Later this week I'm going to send a more thoughtful review.

Well, I looked through the patch and didn't find any issues, so I'll 
mark this ready for committer.


Last implementation differs from the original one and is based on 
suggestion from this message:

https://www.postgresql.org/message-id/2906.1542395026%40sss.pgh.pa.us

It does eval_const_expressions() earlier and pulls up only Const functions.

I also added a few more tests and comments.
New version is in attachments.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 36fefd9..acc3776 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -1071,7 +1071,8 @@ preprocess_expression(PlannerInfo *root, Node *expr, int kind)
 		expr = flatten_join_alias_vars(root->parse, expr);
 
 	/*
-	 * Simplify constant expressions.
+	 * Simplify constant expressions. For function RTEs, this was already done
+	 * by pullup_simple_subqueries.
 	 *
 	 * Note: an essential effect of this is to convert named-argument function
 	 * calls to positional notation and insert the current actual values of
@@ -1085,7 +1086,8 @@ preprocess_expression(PlannerInfo *root, Node *expr, int kind)
 	 * careful to maintain AND/OR flatness --- that is, do not generate a tree
 	 * with AND directly under AND, nor OR directly under OR.
 	 */
-	expr = eval_const_expressions(root, expr);
+	if (kind != EXPRKIND_RTFUNC && kind != EXPRKIND_RTFUNC_LATERAL)
+		expr = eval_const_expressions(root, expr);
 
 	/*
 	 * If it's a qual or havingQual, canonicalize it.
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index 4fbc03f..f96b290 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -79,6 +79,9 @@ static Node *pull_up_simple_union_all(PlannerInfo *root, Node *jtnode,
 static void pull_up_union_leaf_queries(Node *setOp, PlannerInfo *root,
 	   int parentRTindex, Query *setOpQuery,
 	   int childRToffset);
+static void transform_const_function_to_result(PlannerInfo *root, Node *jtnode,
+	 RangeTblEntry *rte,
+	 JoinExpr *lowest_nulling_outer_join);
 static void make_setop_translation_list(Query *query, Index newvarno,
 		List **translated_vars);
 static bool is_simple_subquery(Query *subquery, RangeTblEntry *rte,
@@ -754,6 +757,18 @@ pull_up_subqueries_recurse(PlannerInfo *root, Node *jtnode,
 			is_simple_values(root, rte))
 			return pull_up_simple_values(root, jtnode, rte);
 
+		/*
+		 * Or is it an immutable function that evaluated to a single Const?
+		 */
+		if (rte->rtekind == RTE_FUNCTION)
+		{
+			rte->functions = (List *)
+eval_const_expressions(root, (Node *) rte->functions);
+			transform_const_function_to_result(root, jtnode, rte,
+			   lowest_nulling_outer_join);
+			return jtnode;
+		}
+
 		/* Otherwise, do nothing at this node. */
 	}
 	else if (IsA(jtnode, FromExpr))
@@ -1784,6 +1799,78 @@ is_simple_union_all_recurse(Node *setOp, Query *setOpQuery, List *colTypes)
 }
 
 /*
+ * If the function of this RTE_FUNCTION entry evaluated to a single Const
+ * after eval_const_expressions, transform it to RTE_RESULT.
+ */
+static void
+transform_const_function_to_result(PlannerInfo *root, Node *jtnode,
+   RangeTblEntry *rte,
+   JoinExpr *lowest_nulling_outer_join)
+{
+	ListCell *lc;
+	pullup_replace_vars_context rvcontext;
+	RangeTblFunction *rtf = (RangeTblFunction *) linitial(rte->functions);
+	Query *parse = root->parse;
+
+	if (!IsA(rtf->funcexpr, Const))
+		return;
+
+	/* Fail if the RTE has ORDINALITY - we don't implement that here. */
+	if (rte->funcordinality)
+		return;
+
+	/* Fail if RTE isn't a single, simple Const expr */
+	if (list_length(rte->functions) != 1)
+		return;
+
+	rvcontext.targetlist = list_make1(makeTargetEntry((Expr *) rtf->funcexpr,
+		1 /* resno */, NULL /* resname */, false /* resjunk */));
+	rvcontext.root = root;
+	rvcontext.target_rte = rte;
+
+	/*
+	 * Since this function was reduced to Const, it can't really have lateral
+	 * references, even if it's marked as LATERAL. This means we don't need
+	 * to fill relids.
+	 */
+	rvcontext.relids = NULL;
+
+	rvcontext.outer_hasSubLinks = >hasSubLinks;
+	rvcontext.varno = ((RangeTblRef *) jtnode)->rtindex;
+
+	/*
+	 * If this RTE is on a nullable side of an outer join, we have to insert
+	 * PHVs around our Consts so that they go to null when needed.
+	 */
+	rvcontext.need_phvs = lowest_nulling_outer_join != NULL;
+
+	rvcontext.wrap_non_vars = false;
+	rvcontext.rv_cache = palloc0((list_length(rvcontext.targetlist) + 

Re: Optimze usage of immutable functions as relation

2019-07-23 Thread Anastasia Lubennikova

08.07.2019 4:18, Thomas Munro:

The July Commitfest is here.  Could we please have a rebase of this patch?

Updated patch is in attachments.
I've only resolved one small cosmetic merge conflict.

Later this week I'm going to send a more thoughtful review.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit 959dab6ef869821e757f93a82de40b6bf883ad71
Author: Anastasia 
Date:   Tue Jul 23 14:22:18 2019 +0300

[PATCH v6] Simplify immutable RTE_FUNCTION to RTE_RESULT

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 36fefd9..acc3776 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -1071,7 +1071,8 @@ preprocess_expression(PlannerInfo *root, Node *expr, int kind)
 		expr = flatten_join_alias_vars(root->parse, expr);
 
 	/*
-	 * Simplify constant expressions.
+	 * Simplify constant expressions. For function RTEs, this was already done
+	 * by pullup_simple_subqueries.
 	 *
 	 * Note: an essential effect of this is to convert named-argument function
 	 * calls to positional notation and insert the current actual values of
@@ -1085,7 +1086,8 @@ preprocess_expression(PlannerInfo *root, Node *expr, int kind)
 	 * careful to maintain AND/OR flatness --- that is, do not generate a tree
 	 * with AND directly under AND, nor OR directly under OR.
 	 */
-	expr = eval_const_expressions(root, expr);
+	if (kind != EXPRKIND_RTFUNC && kind != EXPRKIND_RTFUNC_LATERAL)
+		expr = eval_const_expressions(root, expr);
 
 	/*
 	 * If it's a qual or havingQual, canonicalize it.
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index 4fbc03f..fca78b7 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -79,6 +79,9 @@ static Node *pull_up_simple_union_all(PlannerInfo *root, Node *jtnode,
 static void pull_up_union_leaf_queries(Node *setOp, PlannerInfo *root,
 	   int parentRTindex, Query *setOpQuery,
 	   int childRToffset);
+static void transform_const_function_to_result(PlannerInfo *root, Node *jtnode,
+	 RangeTblEntry *rte,
+	 JoinExpr *lowest_nulling_outer_join);
 static void make_setop_translation_list(Query *query, Index newvarno,
 		List **translated_vars);
 static bool is_simple_subquery(Query *subquery, RangeTblEntry *rte,
@@ -754,6 +757,18 @@ pull_up_subqueries_recurse(PlannerInfo *root, Node *jtnode,
 			is_simple_values(root, rte))
 			return pull_up_simple_values(root, jtnode, rte);
 
+		/*
+		 * Or is it an immutable function that evaluated to a single Const?
+		 */
+		if (rte->rtekind == RTE_FUNCTION)
+		{
+			rte->functions = (List *)
+eval_const_expressions(root, (Node *) rte->functions);
+			transform_const_function_to_result(root, jtnode, rte,
+			   lowest_nulling_outer_join);
+			return jtnode;
+		}
+
 		/* Otherwise, do nothing at this node. */
 	}
 	else if (IsA(jtnode, FromExpr))
@@ -1784,6 +1799,76 @@ is_simple_union_all_recurse(Node *setOp, Query *setOpQuery, List *colTypes)
 }
 
 /*
+ * If the function of this RTE_FUNCTION entry evaluated to a single Const
+ * after eval_const_expressions, transform it to RTE_RESULT.
+ */
+static void
+transform_const_function_to_result(PlannerInfo *root, Node *jtnode,
+   RangeTblEntry *rte,
+   JoinExpr *lowest_nulling_outer_join)
+{
+	ListCell *lc;
+	pullup_replace_vars_context rvcontext;
+	RangeTblFunction *rtf = (RangeTblFunction *) linitial(rte->functions);
+	Query *parse = root->parse;
+
+	if (!IsA(rtf->funcexpr, Const))
+		return;
+
+	if (rte->funcordinality)
+		return;
+
+	if (list_length(rte->functions) != 1)
+		return;
+
+	rvcontext.targetlist = list_make1(makeTargetEntry((Expr *) rtf->funcexpr,
+		1 /* resno */, NULL /* resname */, false /* resjunk */));
+	rvcontext.root = root;
+	rvcontext.target_rte = rte;
+
+	/*
+	 * Since this function was reduced to Const, it can't really have lateral
+	 * references, even if it's marked as LATERAL. This means we don't need
+	 * to fill relids.
+	 */
+	rvcontext.relids = NULL;
+
+	rvcontext.outer_hasSubLinks = >hasSubLinks;
+	rvcontext.varno = ((RangeTblRef *) jtnode)->rtindex;
+
+	/*
+	 * If this RTE is on a nullable side of an outer join, we have to insert
+	 * PHVs around our Consts so that they go to null when needed.
+	 */
+	rvcontext.need_phvs = lowest_nulling_outer_join != NULL;
+
+	rvcontext.wrap_non_vars = false;
+	rvcontext.rv_cache = palloc0((list_length(rvcontext.targetlist) + 1)
+ * sizeof(Node *));
+
+	parse->targetList = (List *)
+			pullup_replace_vars((Node *) parse->targetList, );
+	parse->returningList = (List *)
+			pullup_replace_vars((Node *) parse->returningList, );
+	replace_vars_in_jointree((Node *) parse->jointree, , NULL);
+
+	foreach(lc, parse->rtable)
+	{
+		RangeTblEntry *otherrte = (RangeTblEntry *) lfirst(lc);
+
+		if (otherrte->rtekind == RTE_JOIN)
+			

Re: Optimze usage of immutable functions as relation

2019-07-07 Thread Thomas Munro
On Thu, Mar 21, 2019 at 5:58 AM Alexander Kuzmenkov
 wrote:
> On 11/16/18 22:03, Tom Lane wrote:
> > A possible fix for this is to do eval_const_expressions() on
> > function RTE expressions at this stage (and then not need to
> > do it later), and then pull up only when we find that the
> > RTE expression has been reduced to a single Const.
>
>
> Attached is a patch that does this, and transforms RTE_FUCTION that was
> reduced to a single Const into an RTE_RESULT.

Hi Alexander,

The July Commitfest is here.  Could we please have a rebase of this patch?

Thanks,

-- 
Thomas Munro
https://enterprisedb.com




Re: Optimze usage of immutable functions as relation

2019-03-20 Thread Alexander Kuzmenkov

On 11/16/18 22:03, Tom Lane wrote:

A possible fix for this is to do eval_const_expressions() on
function RTE expressions at this stage (and then not need to
do it later), and then pull up only when we find that the
RTE expression has been reduced to a single Const.



Attached is a patch that does this, and transforms RTE_FUCTION that was 
reduced to a single Const into an RTE_RESULT.


Not sure it does everything correctly, but some basic cases work. In 
particular, I don't understand whether it needs any handling of "append 
relations".



--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

>From 4921a58f38659580f76f9684e56c0546d46526bc Mon Sep 17 00:00:00 2001
From: Alexander Kuzmenkov 
Date: Wed, 20 Mar 2019 19:56:20 +0300
Subject: [PATCH v5] Simplify immutable RTE_FUNCTION to RTE_RESULT.

---
 src/backend/optimizer/plan/planner.c  |   6 +-
 src/backend/optimizer/prep/prepjointree.c |  85 ++
 src/test/regress/expected/join.out| 113 +++---
 src/test/regress/sql/join.sql |  44 ++--
 4 files changed, 234 insertions(+), 14 deletions(-)

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index e408e77..fe2f426 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -1050,7 +1050,8 @@ preprocess_expression(PlannerInfo *root, Node *expr, int kind)
 		expr = flatten_join_alias_vars(root->parse, expr);
 
 	/*
-	 * Simplify constant expressions.
+	 * Simplify constant expressions. For function RTEs, this was already done
+	 * by pullup_simple_subqueries.
 	 *
 	 * Note: an essential effect of this is to convert named-argument function
 	 * calls to positional notation and insert the current actual values of
@@ -1064,7 +1065,8 @@ preprocess_expression(PlannerInfo *root, Node *expr, int kind)
 	 * careful to maintain AND/OR flatness --- that is, do not generate a tree
 	 * with AND directly under AND, nor OR directly under OR.
 	 */
-	expr = eval_const_expressions(root, expr);
+	if (kind != EXPRKIND_RTFUNC && kind != EXPRKIND_RTFUNC_LATERAL)
+		expr = eval_const_expressions(root, expr);
 
 	/*
 	 * If it's a qual or havingQual, canonicalize it.
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index aebe162..313c2bb 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -79,6 +79,9 @@ static Node *pull_up_simple_union_all(PlannerInfo *root, Node *jtnode,
 static void pull_up_union_leaf_queries(Node *setOp, PlannerInfo *root,
 		   int parentRTindex, Query *setOpQuery,
 		   int childRToffset);
+static void transform_const_function_to_result(PlannerInfo *root, Node *jtnode,
+	 RangeTblEntry *rte,
+	 JoinExpr *lowest_nulling_outer_join);
 static void make_setop_translation_list(Query *query, Index newvarno,
 			List **translated_vars);
 static bool is_simple_subquery(Query *subquery, RangeTblEntry *rte,
@@ -754,6 +757,18 @@ pull_up_subqueries_recurse(PlannerInfo *root, Node *jtnode,
 			is_simple_values(root, rte))
 			return pull_up_simple_values(root, jtnode, rte);
 
+		/*
+		 * Or is it an immutable function that evaluated to a single Const?
+		 */
+		if (rte->rtekind == RTE_FUNCTION)
+		{
+			rte->functions = (List *) 
+eval_const_expressions(root, (Node *) rte->functions);
+			transform_const_function_to_result(root, jtnode, rte,
+			   lowest_nulling_outer_join);
+			return jtnode;
+		}
+
 		/* Otherwise, do nothing at this node. */
 	}
 	else if (IsA(jtnode, FromExpr))
@@ -1783,6 +1798,76 @@ is_simple_union_all_recurse(Node *setOp, Query *setOpQuery, List *colTypes)
 }
 
 /*
+ * If the function of this RTE_FUNCTION entry evaluated to a single Const
+ * after eval_const_expressions, transform it to RTE_RESULT.
+ */
+static void
+transform_const_function_to_result(PlannerInfo *root, Node *jtnode,
+   RangeTblEntry *rte,
+   JoinExpr *lowest_nulling_outer_join)
+{
+	ListCell *lc;
+	pullup_replace_vars_context rvcontext;
+	RangeTblFunction *rtf = (RangeTblFunction *) linitial(rte->functions);
+	Query *parse = root->parse;
+
+	if (!IsA(rtf->funcexpr, Const))
+		return;
+
+	if (rte->funcordinality)
+		return;
+
+	if (list_length(rte->functions) != 1)
+		return;
+
+	rvcontext.targetlist = list_make1(makeTargetEntry((Expr *) rtf->funcexpr,
+		1 /* resno */, NULL /* resname */, false /* resjunk */));
+	rvcontext.root = root;
+	rvcontext.target_rte = rte;
+
+	/*
+	 * Since this function was reduced to Const, it can't really have lateral
+	 * references, even if it's marked as LATERAL. This means we don't need
+	 * to fill relids.
+	 */
+	rvcontext.relids = NULL;
+
+	rvcontext.outer_hasSubLinks = >hasSubLinks;
+	rvcontext.varno = ((RangeTblRef *) jtnode)->rtindex;
+
+	/*
+	 * If this RTE is on a nullable side of an outer join, we have to insert
+	 * PHVs around 

Re: Optimze usage of immutable functions as relation

2019-03-07 Thread Alexander Kuzmenkov

On 3/5/19 20:22, David Steele wrote:


I'll close this on March 8th if there is no new patch.



This is taking longer than expected. I'll move it to the next commitfest 
and continue working on the patch.


--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Re: Optimze usage of immutable functions as relation

2019-03-05 Thread David Steele

On 2/28/19 4:27 PM, Alexander Kuzmenkov wrote:

On 2/18/19 03:20, Tom Lane wrote:

The dummy-relation stuff I referred to has now been merged, so there's
really no good reason not to revise the patch along that line.



I'll try to post the revised implementation soon.


I'll close this on March 8th if there is no new patch.

Regards,
--
-David
da...@pgmasters.net



Re: Optimze usage of immutable functions as relation

2019-02-28 Thread Alexander Kuzmenkov

On 2/18/19 03:20, Tom Lane wrote:

The dummy-relation stuff I referred to has now been merged, so there's
really no good reason not to revise the patch along that line.



I'll try to post the revised implementation soon.


--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Optimze usage of immutable functions as relation

2019-02-17 Thread Tom Lane
Andres Freund  writes:
> Given this I think the appropriate state of the CF entry would have been
> waiting-for-author, not needs review. Or alternatively
> returned-with-feedback or rejected.  I'm a bit confused as to why the
> patch was moved to the next CF twice?

We have this review from Antonin, and mine in
https://www.postgresql.org/message-id/2906.1542395026%40sss.pgh.pa.us
and there's the cfbot report that the patch doesn't even apply anymore.

The dummy-relation stuff I referred to has now been merged, so there's
really no good reason not to revise the patch along that line.

I'm going to go set this CF entry to waiting-for-author, but unless
a rewritten patch appears soon, I think we should close it
returned-with-feedback.  I think the idea is potentially good, but
as I said in my review, I don't like this implementation; it has the
potential to be a net loss in some cases.

regards, tom lane



Re: Optimze usage of immutable functions as relation

2019-02-15 Thread Andres Freund
On 2018-11-08 15:08:03 +0100, Antonin Houska wrote:
> Aleksandr Parfenov  wrote:
> 
> > I fixed a typo and some comments. Please find new version attached.
> 
> I've checked this verstion too.
> 
> * is_simple_stable_function()
> 
> instead of fetching HeapTuple from the syscache manually, you might want to
> consider using functions from lsyscache.c (get_func_rettype, get_func_retset,
> etc.), or adding a function that returns (subset of) the fields you need in a
> single call.
> 
> * pull_up_simple_function():
> 
> As you assume that ret->functions is a single-item list
> 
>   Assert(list_length(rte->functions) == 1);
> 
> the following iteration is not necessary:
> 
>   foreach(lc, functions_list)
> 
> Also, there seems to be a lot of copy & paste from pull_up_simple_values(), so
> some refactoring would make sense.

Given this I think the appropriate state of the CF entry would have been
waiting-for-author, not needs review. Or alternatively
returned-with-feedback or rejected.  I'm a bit confused as to why the
patch was moved to the next CF twice?

- Andres



Re: Optimze usage of immutable functions as relation

2018-11-16 Thread Tom Lane
Aleksandr Parfenov  writes:
> [ funcscan_plan_optimizer_v4.patch ]

A small note here --- this would be noticeably cleaner if removal of
the no-longer-needed function RTE were done using the dummy-relation
infrastructure I proposed in

https://commitfest.postgresql.org/20/1827/

Maybe we should get that done first and then come back to this.

I'm a bit suspicious of the premise of this though, because it's
going to result in a single call of a function being converted
into possibly many calls, if the RTE is referenced in a lot of
places.  Even if the function is immutable so that those calls
get evaluated at plan time not run time, it's still N calls not
one, which'd be bad if the function is expensive.  See e.g.

https://www.postgresql.org/message-id/flat/CAOYf6edujEOGB-9fGG_V9PGQ5O9yoyWmTnE9RyBYTGw%2BDq3SpA%40mail.gmail.com

for an example where we're specifically telling people that
they can put a function into FROM to avoid multiple evaluation.
This patch breaks that guarantee.

A possible fix for this is to do eval_const_expressions() on
function RTE expressions at this stage (and then not need to
do it later), and then pull up only when we find that the
RTE expression has been reduced to a single Const.  I'm not
sure whether invoking eval_const_expressions() so early
would create any issues.  But if it can be made to work, this
would eliminate quite a lot of the ad-hoc conditions that
you have in the patch now.

regards, tom lane



Re: Optimze usage of immutable functions as relation

2018-11-08 Thread Antonin Houska
Aleksandr Parfenov  wrote:

> I fixed a typo and some comments. Please find new version attached.

I've checked this verstion too.

* is_simple_stable_function()

instead of fetching HeapTuple from the syscache manually, you might want to
consider using functions from lsyscache.c (get_func_rettype, get_func_retset,
etc.), or adding a function that returns (subset of) the fields you need in a
single call.

* pull_up_simple_function():

As you assume that ret->functions is a single-item list

Assert(list_length(rte->functions) == 1);

the following iteration is not necessary:

foreach(lc, functions_list)

Also, there seems to be a lot of copy & paste from pull_up_simple_values(), so
some refactoring would make sense.


-- 
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Web: https://www.cybertec-postgresql.com



Re: Optimze usage of immutable functions as relation

2018-11-08 Thread Antonin Houska
Kyotaro HORIGUCHI  wrote:

> I had the following error with the following query.
> 
> =# explain select * from pg_stat_get_activity(NULL) a join log(10.0) p on 
> a.pid = p.p;
> ERROR:  no relation entry for relid 2
> 

I think that the problem is that RTE_VALUES is wrapped in a subquery by parser
while RTE_FUNCTION is not. Thus some additional processing is done by
pull_up_simple_subquery() for VALUES. What seems to make difference here is
the call of flatten_join_alias_vars() on the query targetlist, although
pull_up_simple_subquery() does it for other reasons.

Actually the comment about flatten_join_alias_vars() in
pull_up_simple_subquery() makes me think if it's o.k. that
pull_up_simple_function() sets rvcontext.need_phvs=false regardless strictness
of the function: I think PlaceHolderVar might be needed if the function is not
strict. (In contrast, pull_up_simple_values() does not have to care because
pull_up_simple_subquery() handles such cases when it's processing the owning
subquery).

-- 
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Web: https://www.cybertec-postgresql.com



Re: Optimze usage of immutable functions as relation

2018-10-23 Thread Kyotaro HORIGUCHI
Hello.

# It might be better that you provided self-contained test case.

As the discussion between Heikki and Tom just upthread, it would
be doable but that usage isn't typical. The query would be
normally written as followings and they are transformed as
desired.

select '|'||subject||'|', ts_rank_cd(body_tsvector,q) from 
messages where body_tsvector @@ to_tsquery('tuple');

or (this doesn't look normal, thought..)

select '|'||subject||'|', ts_rank_cd(body_tsvector,q) from 
messages, (select to_tsquery('tuple') as q) q
where body_tsvector @@ q;

This means that the wanted pull up logic is almost already
there. You should look on how it is handled.


At Sat, 20 Oct 2018 01:31:04 +0700, Aleksandr Parfenov  wrote 
in 
> Hi,
> 
> Thank you for the review.
> I fixed a typo and some comments. Please find new version attached.

I had the following error with the following query.

=# explain select * from pg_stat_get_activity(NULL) a join log(10.0) p on 
a.pid = p.p;
ERROR:  no relation entry for relid 2


As Andrew pointed, you are missing many things to do.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Optimze usage of immutable functions as relation

2018-10-19 Thread Aleksandr Parfenov
Hi,

Thank you for the review.
I fixed a typo and some comments. Please find new version attached.

---
Best regards,
Parfenov Aleksandr

On Fri, 19 Oct 2018 at 16:40, Anthony Bykov  wrote:

> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, failed
> Implements feature:   not tested
> Spec compliant:   not tested
> Documentation:not tested
>
> Hello,
> I was trying to review your patch, but I couldn't install it:
>
> prepjointree.c: In function ‘pull_up_simple_function’:
> prepjointree.c:1793:41: error: ‘functions’ undeclared (first use in this
> function); did you mean ‘PGFunction’?
>   Assert(!contain_vars_of_level((Node *) functions, 0));
>
> Was it a typo?
>
> The new status of this patch is: Waiting on Author
>
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index cd6e119..4a9d74a 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -23,7 +23,9 @@
  */
 #include "postgres.h"
 
+#include "access/htup_details.h"
 #include "catalog/pg_type.h"
+#include "catalog/pg_proc.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/clauses.h"
@@ -35,6 +37,8 @@
 #include "parser/parse_relation.h"
 #include "parser/parsetree.h"
 #include "rewrite/rewriteManip.h"
+#include "utils/syscache.h"
+#include "utils/lsyscache.h"
 
 
 typedef struct pullup_replace_vars_context
@@ -86,6 +90,8 @@ static bool is_simple_subquery(Query *subquery, RangeTblEntry *rte,
    bool deletion_ok);
 static Node *pull_up_simple_values(PlannerInfo *root, Node *jtnode,
 	  RangeTblEntry *rte);
+static Node *pull_up_simple_function(PlannerInfo *root, Node *jtnode,
+	  RangeTblEntry *rte);
 static bool is_simple_values(PlannerInfo *root, RangeTblEntry *rte,
  bool deletion_ok);
 static bool is_simple_union_all(Query *subquery);
@@ -598,6 +604,54 @@ inline_set_returning_functions(PlannerInfo *root)
 	}
 }
 
+static bool
+is_simple_stable_function(RangeTblEntry *rte)
+{
+	Form_pg_type type_form;
+	RangeTblFunction *tblFunction = linitial_node(RangeTblFunction, rte->functions);
+	FuncExpr   *expr = (FuncExpr *) tblFunction->funcexpr;
+	HeapTuple tuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(expr->funcresulttype));
+	bool		result = false;
+
+	if (!HeapTupleIsValid(tuple))
+		elog(ERROR, "cache lookup failed for type %u", expr->funcresulttype);
+
+	type_form = (Form_pg_type) GETSTRUCT(tuple);
+
+	if (type_form->typtype == TYPTYPE_BASE &&
+		!type_is_array(expr->funcresulttype))
+	{
+		Form_pg_proc func_form;
+		ListCell   *arg;
+		bool		has_nonconst_input = false;
+		bool		has_null_input = false;
+
+		ReleaseSysCache(tuple);
+		tuple = SearchSysCache1(PROCOID, ObjectIdGetDatum(expr->funcid));
+		if (!HeapTupleIsValid(tuple))
+			elog(ERROR, "cache lookup failed for function %u", expr->funcid);
+		func_form = (Form_pg_proc) GETSTRUCT(tuple);
+
+		foreach(arg, expr->args)
+		{
+			if (IsA(lfirst(arg), Const))
+has_null_input |= ((Const *) lfirst(arg))->constisnull;
+			else
+has_nonconst_input = true;
+		}
+
+		result = func_form->prorettype != RECORDOID &&
+ func_form->prokind == PROKIND_FUNCTION &&
+ !func_form->proretset &&
+ func_form->provolatile == PROVOLATILE_IMMUTABLE &&
+ !has_null_input &&
+ !has_nonconst_input;
+	}
+
+	ReleaseSysCache(tuple);
+	return result;
+}
+
 /*
  * pull_up_subqueries
  *		Look for subqueries in the rangetable that can be pulled up into
@@ -728,6 +782,11 @@ pull_up_subqueries_recurse(PlannerInfo *root, Node *jtnode,
 			is_simple_values(root, rte, deletion_ok))
 			return pull_up_simple_values(root, jtnode, rte);
 
+		if (rte->rtekind == RTE_FUNCTION &&
+list_length(rte->functions) == 1 &&
+is_simple_stable_function(rte))
+			return pull_up_simple_function(root, jtnode, rte);
+
 		/* Otherwise, do nothing at this node. */
 	}
 	else if (IsA(jtnode, FromExpr))
@@ -1710,6 +1769,98 @@ pull_up_simple_values(PlannerInfo *root, Node *jtnode, RangeTblEntry *rte)
 	return NULL;
 }
 
+static Node *
+pull_up_simple_function(PlannerInfo *root, Node *jtnode, RangeTblEntry *rte)
+{
+	Query	   *parse = root->parse;
+	int			varno = ((RangeTblRef *) jtnode)->rtindex;
+	List	   *functions_list;
+	List	   *tlist;
+	AttrNumber	attrno;
+	pullup_replace_vars_context rvcontext;
+	ListCell   *lc;
+
+	Assert(rte->rtekind == RTE_FUNCTION);
+	Assert(list_length(rte->functions) == 1);
+
+	/*
+	 * Need a modifiable copy of the functions list to hack on, just in case it's
+	 * multiply referenced.
+	 */
+	functions_list = copyObject(rte->functions);
+
+	/*
+	 * The FUNCTION RTE can't contain any Vars of level zero, let alone any that
+	 * are join aliases, so no need to flatten join alias Vars.
+	 */
+	Assert(!contain_vars_of_level((Node *) functions_list, 0));
+
+	/*
+	 * Set up required context data for pullup_replace_vars.  In particular,
+	 * we have to make the FUNCTION 

Re: Optimze usage of immutable functions as relation

2018-10-19 Thread Anthony Bykov
The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:   not tested
Spec compliant:   not tested
Documentation:not tested

Hello, 
I was trying to review your patch, but I couldn't install it:

prepjointree.c: In function ‘pull_up_simple_function’:
prepjointree.c:1793:41: error: ‘functions’ undeclared (first use in this 
function); did you mean ‘PGFunction’?
  Assert(!contain_vars_of_level((Node *) functions, 0));

Was it a typo?

The new status of this patch is: Waiting on Author


Re: Optimze usage of immutable functions as relation

2018-09-06 Thread Aleksandr Parfenov
On Tue, 10 Jul 2018 17:34:20 -0400
Tom Lane  wrote:

>Heikki Linnakangas  writes:
>> But stepping back a bit, it's a bit weird that we're handling this 
>> differently from VALUES and other subqueries. The planner knows how
>> to do this trick for simple subqueries:  
>
>> postgres=# explain select * from tenk1, (select abs(100)) as a (a)
>> where unique1 < a;
>>  QUERY PLAN
>> ---
>>   Seq Scan on tenk1  (cost=0.00..483.00 rows=100 width=248)
>> Filter: (unique1 < 100)
>> (2 rows)  
>
>> Note that it not only evaluated the function into a constant, but
>> also got rid of the join. For a function RTE, however, it can't do
>> that:  
>
>> postgres=# explain select * from tenk1, abs(100) as a (a) where
>> unique1 < a; QUERY PLAN
>> ---
>>   Nested Loop  (cost=0.00..583.01 rows= width=248)
>> Join Filter: (tenk1.unique1 < a.a)  
>> ->  Function Scan on a  (cost=0.00..0.01 rows=1 width=4)
>> ->  Seq Scan on tenk1  (cost=0.00..458.00 rows=1 width=244)  
>> (4 rows)  
>
>> Could we handle this in pull_up_subqueries(), similar to the 
>> RTE_SUBQUERY and RTE_VALUES cases?  
>
>Perhaps.  You could only do it for non-set-returning functions, which
>isn't the typical use of function RTEs, which is probably why we've not
>thought hard about it before.  I'm not sure what would need to happen
>for lateral functions.  Also to be considered, if it's not foldable to
>a constant, is whether we're risking evaluating it more times than
>before.
>
>   regards, tom lane

I reworked the patch and implemented processing of FuncScan in
pull_up_subqueries() in a way similar to VALUES processing. In order to
prevent folding of non-foldable functions it checks provolatile of the
function and are arguments const or not and return type to prevent
folding of SRF.

-- 
Aleksandr Parfenov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index c3f46a26c3..25539bbfae 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -23,7 +23,9 @@
  */
 #include "postgres.h"
 
+#include "access/htup_details.h"
 #include "catalog/pg_type.h"
+#include "catalog/pg_proc.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/clauses.h"
@@ -35,6 +37,8 @@
 #include "parser/parse_relation.h"
 #include "parser/parsetree.h"
 #include "rewrite/rewriteManip.h"
+#include "utils/syscache.h"
+#include "utils/lsyscache.h"
 
 
 typedef struct pullup_replace_vars_context
@@ -86,6 +90,8 @@ static bool is_simple_subquery(Query *subquery, RangeTblEntry *rte,
    bool deletion_ok);
 static Node *pull_up_simple_values(PlannerInfo *root, Node *jtnode,
 	  RangeTblEntry *rte);
+static Node *pull_up_simple_function(PlannerInfo *root, Node *jtnode,
+	  RangeTblEntry *rte);
 static bool is_simple_values(PlannerInfo *root, RangeTblEntry *rte,
  bool deletion_ok);
 static bool is_simple_union_all(Query *subquery);
@@ -595,6 +601,54 @@ inline_set_returning_functions(PlannerInfo *root)
 	}
 }
 
+static bool
+is_simple_stable_function(RangeTblEntry *rte)
+{
+	Form_pg_type type_form;
+	RangeTblFunction *tblFunction = linitial_node(RangeTblFunction, rte->functions);
+	FuncExpr   *expr = (FuncExpr *) tblFunction->funcexpr;
+	HeapTuple tuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(expr->funcresulttype));
+	bool		result = false;
+
+	if (!HeapTupleIsValid(tuple))
+		elog(ERROR, "cache lookup failed for type %u", expr->funcresulttype);
+
+	type_form = (Form_pg_type) GETSTRUCT(tuple);
+
+	if (type_form->typtype == TYPTYPE_BASE &&
+		!type_is_array(expr->funcresulttype))
+	{
+		Form_pg_proc func_form;
+		ListCell   *arg;
+		bool		has_nonconst_input = false;
+		bool		has_null_input = false;
+
+		ReleaseSysCache(tuple);
+		tuple = SearchSysCache1(PROCOID, ObjectIdGetDatum(expr->funcid));
+		if (!HeapTupleIsValid(tuple))
+			elog(ERROR, "cache lookup failed for function %u", expr->funcid);
+		func_form = (Form_pg_proc) GETSTRUCT(tuple);
+
+		foreach(arg, expr->args)
+		{
+			if (IsA(lfirst(arg), Const))
+has_null_input |= ((Const *) lfirst(arg))->constisnull;
+			else
+has_nonconst_input = true;
+		}
+
+		result = func_form->prorettype != RECORDOID &&
+ func_form->prokind == PROKIND_FUNCTION &&
+ !func_form->proretset &&
+ func_form->provolatile == PROVOLATILE_IMMUTABLE &&
+ !has_null_input &&
+ !has_nonconst_input;
+	}
+
+	ReleaseSysCache(tuple);
+	return result;
+}
+
 /*
  * pull_up_subqueries
  *		Look for subqueries in the rangetable that can be pulled up into
@@ -725,6 +779,11 @@ pull_up_subqueries_recurse(PlannerInfo *root, Node *jtnode,
 			is_simple_values(root, rte, deletion_ok))
 			return pull_up_simple_values(root, jtnode, rte);
 
+		if (rte->rtekind 

Re: Optimze usage of immutable functions as relation

2018-07-10 Thread Tom Lane
Heikki Linnakangas  writes:
> But stepping back a bit, it's a bit weird that we're handling this 
> differently from VALUES and other subqueries. The planner knows how to 
> do this trick for simple subqueries:

> postgres=# explain select * from tenk1, (select abs(100)) as a (a) where 
> unique1 < a;
>  QUERY PLAN
> ---
>   Seq Scan on tenk1  (cost=0.00..483.00 rows=100 width=248)
> Filter: (unique1 < 100)
> (2 rows)

> Note that it not only evaluated the function into a constant, but also 
> got rid of the join. For a function RTE, however, it can't do that:

> postgres=# explain select * from tenk1, abs(100) as a (a) where unique1 < a;
>  QUERY PLAN
> ---
>   Nested Loop  (cost=0.00..583.01 rows= width=248)
> Join Filter: (tenk1.unique1 < a.a)
> ->  Function Scan on a  (cost=0.00..0.01 rows=1 width=4)
> ->  Seq Scan on tenk1  (cost=0.00..458.00 rows=1 width=244)
> (4 rows)

> Could we handle this in pull_up_subqueries(), similar to the 
> RTE_SUBQUERY and RTE_VALUES cases?

Perhaps.  You could only do it for non-set-returning functions, which
isn't the typical use of function RTEs, which is probably why we've not
thought hard about it before.  I'm not sure what would need to happen for
lateral functions.  Also to be considered, if it's not foldable to a
constant, is whether we're risking evaluating it more times than before.

regards, tom lane



Re: Optimze usage of immutable functions as relation

2018-07-10 Thread Heikki Linnakangas

On 16/05/18 13:47, Aleksandr Parfenov wrote:

Hello,

I reworked a patch to make more stable in different cases. I decided to
use simplify_function instead of eval_const_expression to prevent
inlining of the function. The only possible outputs of the
simplify_function are Const node and NULL.


Hmm. That's still a bit inefficient, we'll try to simplify the function 
on every reference to it.


We already simplify functions in function RTEs, but we currently do it 
after preprocessing all other expressions in the query. See 
subquery_planner(), around comment "/* Also need to preprocess 
expressions within RTEs */". If we moved that so that we simplify 
expressions in the range table first, then in the code that you're 
adding to eval_const_expression_mutator(), you could just check if the 
function expression is already a Const.



But stepping back a bit, it's a bit weird that we're handling this 
differently from VALUES and other subqueries. The planner knows how to 
do this trick for simple subqueries:


postgres=# explain select * from tenk1, (select abs(100)) as a (a) where 
unique1 < a;

QUERY PLAN
---
 Seq Scan on tenk1  (cost=0.00..483.00 rows=100 width=248)
   Filter: (unique1 < 100)
(2 rows)

Note that it not only evaluated the function into a constant, but also 
got rid of the join. For a function RTE, however, it can't do that:


postgres=# explain select * from tenk1, abs(100) as a (a) where unique1 < a;
QUERY PLAN
---
 Nested Loop  (cost=0.00..583.01 rows= width=248)
   Join Filter: (tenk1.unique1 < a.a)
   ->  Function Scan on a  (cost=0.00..0.01 rows=1 width=4)
   ->  Seq Scan on tenk1  (cost=0.00..458.00 rows=1 width=244)
(4 rows)

Could we handle this in pull_up_subqueries(), similar to the 
RTE_SUBQUERY and RTE_VALUES cases?


- Heikki



Re: Optimze usage of immutable functions as relation

2018-05-16 Thread Aleksandr Parfenov
Hello,

I reworked a patch to make more stable in different cases. I decided to
use simplify_function instead of eval_const_expression to prevent
inlining of the function. The only possible outputs of the
simplify_function are Const node and NULL.

Also, I block pre-evaluation of functions with types other than
TYPTYPE_BASE, cause there is no special logic for compound (and others)
values yet.

There is still a problem with memory leak in case of simplified
arguments. The only way I see is a creation of temporary memory context,
but it cost some performance. Maybe we can store simplified arguments
in the pointed function itself for later use. But eval_const_expression
and friends doesn't change the content of the nodes inside the tree, it
generates new nodes and returns it as a result.

The last point to mention is a fixed plan for the query in the initial
letter of the thread. As I mentioned before, new versions of the patch
replace var not with a function call, but with a function execution
result. After the patch, the following plan is used instead of Nested
Loop with Sequence Scan:

explain select '|'||subject||'|', ts_rank_cd(body_tsvector,q) from messages, 
to_tsquery('english', 'tuple') q where body_tsvector @@ q limit 
10;
 QUERY PLAN

 Limit  (cost=224.16..266.11 rows=3 width=36)
   ->  Nested Loop  (cost=224.16..266.11 rows=3 width=36)
 ->  Function Scan on q  (cost=0.00..0.01 rows=1 width=0)
 ->  Bitmap Heap Scan on messages  (cost=224.16..266.04 rows=3 
width=275)
   Recheck Cond: (body_tsvector @@ '''tupl'' & ''header'' & 
''overhead'''::tsquery)
   ->  Bitmap Index Scan on message_body_idx  (cost=0.00..224.16 
rows=3 width=0)
 Index Cond: (body_tsvector @@ '''tupl'' & ''header'' & 
''overhead'''::tsquery)

-- 
Aleksandr Parfenov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 505ae0af85..2c9983004a 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -3655,6 +3655,59 @@ eval_const_expressions_mutator(Node *node,
 	  context);
 			}
 			break;
+		case T_Var:
+			if (context->root && context->root->parse->rtable)
+			{
+Var		   *var;
+Query	   *query;
+RangeTblEntry *pointedNode;
+
+var = (Var *)node;
+query = context->root->parse;
+
+if (var->varlevelsup != 0 || var->varattno != 1)
+	break;
+
+pointedNode = list_nth(query->rtable, var->varno - 1);
+Assert(IsA(pointedNode, RangeTblEntry));
+
+if (pointedNode->rtekind == RTE_FUNCTION && list_length(pointedNode->functions) == 1)
+{
+	Form_pg_type type_form;
+	Node	   *result;
+	RangeTblFunction *tblFunction = linitial_node(RangeTblFunction, pointedNode->functions);
+	FuncExpr   *expr = (FuncExpr *) tblFunction->funcexpr;
+	List	   *args = expr->args;
+	HeapTuple type_tuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(expr->funcresulttype));
+
+	if (!HeapTupleIsValid(type_tuple))
+		elog(ERROR, "cache lookup failed for type %u", expr->funcresulttype);
+
+	type_form = (Form_pg_type) GETSTRUCT(type_tuple);
+
+	if (type_form->typtype != TYPTYPE_BASE)
+	{
+		ReleaseSysCache(type_tuple);
+		break;
+	}
+
+	result = simplify_function(expr->funcid,
+			   expr->funcresulttype,
+			   exprTypmod(expr),
+			   expr->funccollid,
+			   expr->inputcollid,
+			   ,
+			   expr->funcvariadic,
+			   true,
+			   false,
+			   context);
+
+	ReleaseSysCache(type_tuple);
+
+	if (result) /* successfully simplified it */
+		return (Node *) result;
+			}
+			break;
 		default:
 			break;
 	}
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index cbc882d47b..eb92f556d8 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -3021,23 +3021,23 @@ select * from
   int4(sin(1)) q1,
   int4(sin(0)) q2
 where q1 = thousand or q2 = thousand;
-   QUERY PLAN   
-
- Hash Join
-   Hash Cond: (tenk1.twothousand = int4_tbl.f1)
+   QUERY PLAN
+-
+ Nested Loop
->  Nested Loop
- ->  Nested Loop
-   ->  Function Scan on q1
-   ->  Function Scan on q2
- ->  Bitmap Heap Scan on tenk1
-   Recheck Cond: ((q1.q1 = thousand) OR (q2.q2 = thousand))
-   ->  BitmapOr
- ->  Bitmap Index Scan on tenk1_thous_tenthous
- 

Re: Optimze usage of immutable functions as relation

2018-05-08 Thread Andrew Gierth
> "Aleksandr" == Aleksandr Parfenov  writes:

 >> From an implementation point of view your patch is obviously broken
 >> in many ways (starting with not checking varattno anywhere, and not
 >> actually checking anywhere if the expression is volatile).

 Aleksandr> The actual checking if the expression volatile or not is
 Aleksandr> done inside evaluate_function(). This is called through few
 Aleksandr> more function in eval_const_experssion(). If it's volatile,
 Aleksandr> the eval_const_expression() will return FuncExpr node, Const
 Aleksandr> otherwise. It also checks are arguments immutable or not.

You're missing a ton of other possible cases, of which by far the most
notable is function inlining: eval_const_expressions will inline even a
volatile function and return a new expression tree (which could be
almost anything depending on the function body).

 Aleksandr> I agree about varattno, it should be checked. Even in case
 Aleksandr> of SRF not replaced, it is better to be sure that Var points
 Aleksandr> to first (and the only) attribute.

It's not a matter of "better", but of basic correctness. Functions can
return composite values with columns.

-- 
Andrew (irc:RhodiumToad)



Re: Optimze usage of immutable functions as relation

2018-05-08 Thread Aleksandr Parfenov
Hello Andrew,

Thank you for the review of the patch.

On Fri, 04 May 2018 08:37:31 +0100
Andrew Gierth  wrote:
> From an implementation point of view your patch is obviously broken in
> many ways (starting with not checking varattno anywhere, and not
> actually checking anywhere if the expression is volatile).

The actual checking if the expression volatile or not is done inside
evaluate_function(). This is called through few more function in
eval_const_experssion(). If it's volatile, the eval_const_expression()
will return FuncExpr node, Const otherwise. It also checks are arguments
immutable or not.

I agree about varattno, it should be checked. Even in case of SRF not
replaced, it is better to be sure that Var points to first (and the
only) attribute.

> But more importantly the plans that you showed seem _worse_ in that
> you've created extra calls to to_tsquery even though the query has
> been written to try and avoid that.
> 
> I think you need to take a step back and explain more precisely what
> you think you're trying to do and what the benefit (and cost) is.

Sure, the first version is some kind of prototype and attempt to
improve execution of the certain type of queries. The best solution I
see is to use the result of the function where it is required and remove
the nested loop in case of immutable functions. In this case, the
planner can choose a better plan if function result is used for tuples
selecting or as a join filter. But it will increase planning time due to
the execution of immutable functions.

It looks like I did something wrong with plans in the example, because
attached patch replaces function-relation usage with result value, not
function call. But nested loop is still in the plan.

I'm open to any suggestions/notices/critics about the idea and approach.

-- 
Aleksandr Parfenov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: Optimze usage of immutable functions as relation

2018-05-04 Thread Andrew Gierth
> "Aleksandr" == Aleksandr Parfenov  writes:

 Aleksandr> The idea of the fix for this situation is to check is a
 Aleksandr> result of the function constant or not during the planning
 Aleksandr> of the query. Attached patch does this by processing Var
 Aleksandr> entries at planner stage and replace them with constant
 Aleksandr> value if it is possible. Plans after applying a patch
 Aleksandr> (SeqScan query for comparison):

>From an implementation point of view your patch is obviously broken in
many ways (starting with not checking varattno anywhere, and not
actually checking anywhere if the expression is volatile).

But more importantly the plans that you showed seem _worse_ in that
you've created extra calls to to_tsquery even though the query has been
written to try and avoid that.

I think you need to take a step back and explain more precisely what you
think you're trying to do and what the benefit (and cost) is.

Specific coding style points (not exhaustive):

 Aleksandr>  pointedNode->functions->length == 1

list_length()

 Aleksandr> pointedNode->functions->head->data.ptr_value

linitial() or linitial_node()

 Aleksandr> if (result->type == T_FuncExpr)

IsA()

(what if the result is the inline expansion of a volatile function?)

 Aleksandr> pfree(result);

result is a whole tree of nodes, pfree is pointless here

(don't bother trying to fix the above points in this patch, that won't
address the fundamental flaws)

-- 
Andrew (irc:RhodiumToad)



Optimze usage of immutable functions as relation

2018-05-03 Thread Aleksandr Parfenov
Hi hackers,

There is a strange behavior of the query planner in some cases if
stable/immutable was used a relation. In some cases, it affects costs of
operations and leads to a bad plan of the execution. Oleg Bartunov 
noticed
such behavior in queries with a to_tsvector as a relation:

=# explain select '|'||subject||'|', ts_rank_cd(body_tsvector,q) from 
messages, to_tsquery('tuple') q where body_tsvector @@ 
q;
 QUERY PLAN
--
  Nested Loop  (cost=383.37..58547.70 rows=4937 width=36)
->  Function Scan on to_tsquery q  (cost=0.25..0.26 rows=1 width=32)
->  Bitmap Heap Scan on messages  (cost=383.12..58461.04 rows=4937 
width=275)
  Recheck Cond: (body_tsvector @@ q.q)
  ->  Bitmap Index Scan on message_body_idx  (cost=0.00..381.89 
rows=4937 width=0)
Index Cond: (body_tsvector @@ q.q)
(6 rows)

=# explain select '|'||subject||'|', ts_rank_cd(body_tsvector,q) from 
messages, to_tsquery('tuple') q where body_tsvector @@
q limit 10;
QUERY PLAN

  Limit  (cost=0.25..425.62 rows=10 width=36)
->  Nested Loop  (cost=0.25..210005.80 rows=4937 width=36)
  Join Filter: (messages.body_tsvector @@ q.q)
  ->  Function Scan on to_tsquery q  (cost=0.25..0.26 rows=1 
width=32)
  ->  Seq Scan on messages  (cost=0.00..197625.45 rows=987445 
width=275)

The idea of the fix for this situation is to check is a result of the
function constant or not during the planning of the query. Attached 
patch does
this by processing Var entries at planner stage and replace them with
constant value if it is possible. Plans after applying a patch (SeqScan
query for comparison):

=# explain select '|'||subject||'|', ts_rank_cd(body_tsvector,q) from 
messages, to_tsquery('tuple') q where body_tsvector @@
q limit 10;
   QUERY PLAN
--
  Limit  (cost=224.66..268.11 rows=3 width=36)
->  Nested Loop  (cost=224.66..268.11 rows=3 width=36)
  ->  Function Scan on to_tsquery q  (cost=0.25..0.26 rows=1 
width=0)
  ->  Bitmap Heap Scan on messages  (cost=224.41..267.04 rows=3 
width=275)
Recheck Cond: (body_tsvector @@ 
to_tsquery('tuple'::text))
->  Bitmap Index Scan on message_body_idx  
(cost=0.00..224.41 rows=3 width=0)
  Index Cond: (body_tsvector @@ 
to_tsquery('tuple'::text))
(7 rows)

=# set enable_bitmapscan=off;
SET
=# explain select '|'||subject||'|', ts_rank_cd(body_tsvector,q) from 
messages, to_tsquery('tuple') q where body_tsvector @@
q limit 10;
 QUERY PLAN
--
  Limit  (cost=1000.25..296754.14 rows=3 width=36)
->  Gather  (cost=1000.25..296754.14 rows=3 width=36)
  Workers Planned: 2
  ->  Nested Loop  (cost=0.25..295753.32 rows=1 width=36)
->  Parallel Seq Scan on messages
(cost=0.00..295752.80 rows=1 width=275)
  Filter: (body_tsvector @@ 
to_tsquery('tuple'::text))
->  Function Scan on to_tsquery q  (cost=0.25..0.26 
rows=1 width=0)
(7 rows)

-- 
Aleksandr Parfenov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 505ae0af85..410a14ed95 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -3655,6 +3655,40 @@ eval_const_expressions_mutator(Node *node,
 	  context);
 			}
 			break;
+		case T_Var:
+			if (context->root && context->root->parse->rtable)
+			{
+Var		   *var;
+Query	   *query;
+RangeTblEntry *pointedNode;
+
+var = (Var *)node;
+query = context->root->parse;
+
+if (var->varlevelsup != 0)
+	break;
+
+pointedNode = list_nth(query->rtable, var->varno - 1);
+Assert(IsA(pointedNode, RangeTblEntry));
+
+if (pointedNode->rtekind == RTE_FUNCTION && pointedNode->functions->length == 1)
+{
+	Node	   *result;
+	RangeTblFunction *tblFunction = pointedNode->functions->head->data.ptr_value;
+
+	Assert(IsA(tblFunction, RangeTblFunction));
+	result = eval_const_expressions(context->root, tblFunction->funcexpr);
+
+	if (result->type == T_FuncExpr)
+	{
+		pfree(result);
+		return node;
+	}
+
+	return result;
+}
+			}
+			break;
 		default:
 			break;
 	}
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index cbc882d47b..eb92f556d8 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -3021,23 +3021,23 @@