Re: [GENERAL] Horrible/never returning performance using stable function on WHERE clause

2016-03-29 Thread Achilleas Mantzios

Hello David

On 29/03/2016 14:04, David Rowley wrote:

On 29 March 2016 at 20:01, Achilleas Mantzios
 wrote:

We are using PostgreSQL 9.3. Shouldn't the optimizer use a single call to
get_machdef_sister_defids in get_machdef_sister_defids_maxrh ??

It shouldn't be up to the optimizer to evaluate a STABLE function.
Only IMMUTABLE functions will be evaluated during planning.
What's not that clear to me is if the planner might be able to work a
bit harder to create an "Initplan" for stable functions with Const
arguments. Right now I can't quite see a reason why that couldn't be
improved upon, after all, the documentation does claim that a STABLE
function during a "single table scan it will consistently return the
same result for the same argument values".

And to add here the docs 
(http://www.postgresql.org/docs/9.3/static/xfunc-volatility.html) also say :
"A STABLE function cannot modify the database and is guaranteed to return the same results given the same arguments for all rows within a single statement. *This category allows the optimizer to 
optimize multiple calls of the function to a single call*. In particular, it is safe to use an expression containing such a function in an index scan condition."

However it would be quite simple just for you to force the STABLE
function to be evaluated once, instead of once per row, just by
modifying your query to become:

select max(rh) into tmp from items where vslwhid=vvslid and
itoar(defid) ~ (select get_machdef_sister_defids(vdefid));

Viewing the EXPLAIN of this, you'll notice the InitPlan, which will
evaluate the function and allow the use the output value as a
parameter in the main query.

That's true, this worked indeed. But still cannot understand why the 
distinction between ~ get_machdef_sister_defids(...) and  ~ (SELECT 
get_machdef_sister_defids(...)).
Why is the planner forced in the second case and not in the first, since 
clearly the input argument is not dependent on any query result? (judging by 
the docs).


--
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt



Re: [GENERAL] Horrible/never returning performance using stable function on WHERE clause

2016-03-29 Thread David Rowley
On 29 March 2016 at 20:01, Achilleas Mantzios
 wrote:
> We are using PostgreSQL 9.3. Shouldn't the optimizer use a single call to
> get_machdef_sister_defids in get_machdef_sister_defids_maxrh ??

It shouldn't be up to the optimizer to evaluate a STABLE function.
Only IMMUTABLE functions will be evaluated during planning.

What's not that clear to me is if the planner might be able to work a
bit harder to create an "Initplan" for stable functions with Const
arguments. Right now I can't quite see a reason why that couldn't be
improved upon, after all, the documentation does claim that a STABLE
function during a "single table scan it will consistently return the
same result for the same argument values".

However it would be quite simple just for you to force the STABLE
function to be evaluated once, instead of once per row, just by
modifying your query to become:

select max(rh) into tmp from items where vslwhid=vvslid and
itoar(defid) ~ (select get_machdef_sister_defids(vdefid));

Viewing the EXPLAIN of this, you'll notice the InitPlan, which will
evaluate the function and allow the use the output value as a
parameter in the main query.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Horrible/never returning performance using stable function on WHERE clause

2016-03-29 Thread Achilleas Mantzios

Hello list,
I have written some functions to extract some data from our DB, from an hierarchical structure, the problem is that if a function doing lookups is defined as STABLE in the WHERE clause the performance 
is horrible. What I am trying to achieve is given a specific node in an hierarchical structure (a machine definition) to find all its equivalent sister nodes and then for a specific instance of this 
hierarchy to find the max RH (running hours) among all sister nodes.I am using some functions/opers from intarray. Here are the functions :


Compares two nodes for sister property:

CREATE OR REPLACE FUNCTION public.is_defid_sister_node(vdefid1 integer, vdefid2 
integer)
 RETURNS boolean
 LANGUAGE plpgsql
 STABLE
AS $function$DECLARE
vparents1 INTEGER[];
vparents2 INTEGER[];
descr1 TEXT;
descr2 TEXT;
i INTEGER;
BEGIN

SELECT 
COALESCE(partid,0)||'__'||regexp_replace(coalesce(description,''),'[nN][oO]([0-9]+)',''),parents
 into descr1,vparents1 FROM machdefs where defid=vdefid1;
SELECT 
COALESCE(partid,0)||'__'||regexp_replace(coalesce(description,''),'[nN][oO]([0-9]+)',''),parents
 into descr2,vparents2 FROM machdefs where defid=vdefid2;

IF (level(vparents1) = 0 AND level(vparents2) = 0) THEN
RETURN vdefid1=vdefid2;
ELSIF (level(vparents1) <> level(vparents2)) THEN
RETURN false;
ELSE
RETURN ((descr1=descr2) AND 
is_defid_sister_node(first(vparents1),first(vparents2)));
END IF;

END;$function$

Finds the set of sister nodes for a given node:

CREATE OR REPLACE FUNCTION public.get_machdef_sister_defids(vdefid integer)
 RETURNS INTEGER[]
 LANGUAGE plpgsql
 STABLE
AS $function$
DECLARE
tmp INTEGER[];
BEGIN

select (select array_agg(mdsis.defid) FROM machdefs mdsis WHERE mdsis.machtypeid=md.machtypeid AND level(mdsis.parents)=level(md.parents) AND last(mdsis.parents)=last(md.parents) AND 
is_defid_sister_node(mdsis.defid,md.defid)  ) INTO tmp from machdefs md where md.defid=vdefid;


IF (tmp IS NULL) THEN
tmp := '{}';
END IF;
RETURN tmp;

END;
$function$

Finds max RH for a given tree instance among all sister nodes of a given node :

CREATE OR REPLACE FUNCTION public.get_machdef_sister_defids_maxrh(vvslid 
INTEGER,vdefid INTEGER)
 RETURNS INTEGER
 LANGUAGE plpgsql
 STABLE
AS $function$
DECLARE
tmp INTEGER;
BEGIN
select max(rh) into tmp from items where vslwhid=vvslid and 
itoar(defid) ~ get_machdef_sister_defids(vdefid);
RETURN tmp;
END;
$function$


Query :
select get_machdef_sister_defids_maxrh(479,319435);

never ends (I have waited till 2-3 minutes), however, *doing the wrong thing* 
and declaring get_machdef_sister_defids as IMMUTABLE makes the above call 
return fast :

# select get_machdef_sister_defids_maxrh(479,319435);
 get_machdef_sister_defids_maxrh
-
   10320
(1 row)

Time: 110.211 ms

We are using PostgreSQL 9.3. Shouldn't the optimizer use a single call to
get_machdef_sister_defids in get_machdef_sister_defids_maxrh ??

Defining get_machdef_sister_defids back to STABLE and forcing 
get_machdef_sister_defids_maxrh to only call get_machdef_sister_defids once 
makes things work again :

CREATE OR REPLACE FUNCTION public.get_machdef_sister_defids_maxrh(vvslid 
integer, vdefid integer)
 RETURNS integer
 LANGUAGE plpgsql
 STABLE
AS $function$
DECLARE
tmp INTEGER;
tmppars INTEGER[];
BEGIN
tmppars := get_machdef_sister_defids(vdefid);
select max(rh) into tmp from items where vslwhid=vvslid and 
itoar(defid) ~ tmppars;
RETURN tmp;
END;
$function$

# select get_machdef_sister_defids_maxrh(479,319435);
 get_machdef_sister_defids_maxrh
-
   10320
(1 row)

Time: 111.318 ms

Is this expected ?

--
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general