Hello all, long time no chit-chat on the PG mailing list.  We're upgrading from 
8.0.3 to 8.3 and found that some stored procedures utilizing int_agg that we 
had left over from 7.3 had terrible performance.  No problem, using ANY() we're 
able to regain that performance, more or less, and at the same time greatly 
simplify our stored procedures.  But things can never be fast enough, can they? 
 So I have a question or two.  Here's my function for reference:

CREATE OR REPLACE FUNCTION bg_nodes2descendants(INT[], INT) RETURNS SETOF 
connection_generation AS
'

DECLARE
_row connection_generation%ROWTYPE;
_children INT[];

BEGIN

-- this is faster than constructing in the loop below
--_children = array(SELECT connectee_node_id FROM connections WHERE 
connection_type_id = 1 AND connector_node_id = ANY($1));

FOR _row IN
    SELECT connection_id, connection_type_id, connector_node_id, 
connector_node_type_id, connectee_node_id,
            connectee_node_type_id, current, timestamp, $2 + 1
        FROM connections WHERE connection_type_id = 1 AND connector_node_id = 
ANY($1)
    LOOP
        _children := _children || _row.connectee_node_id;
        RETURN NEXT _row;
    END LOOP;

IF FOUND THEN
    RETURN QUERY SELECT * FROM bg_nodes2descendants(_children, $2+1);
END IF;

RETURN;
END

' LANGUAGE 'plpgsql';

     So, my concern is alluded to in the comment above.  When I use this 
function in places where it returns large results, building the _children array 
directly (in the commented out line) is about 25% faster.  But I'd like to 
avoid building the children array altogether and would instead like to generate 
that array from the already collected output rows.  For example, right before 
the recursive call, I'd like to select a column of the buffered output rows, 
cast it to an integer[], and pass it into the recursive call.  Is there an 
internal value I can access for this such as:

_children := array(SELECT connectee_node_id FROM $output);

Bonus question -  if I rewrite the first FOR loop as:

RETURN QUERY SELECT connection_id, connection_type_id, connector_node_id, 
connector_node_type_id, connectee_node_id,
                             connectee_node_type_id, current, timestamp, $2 + 1 
FROM connections
                 WHERE connection_type_id = 1 AND connector_node_id = ANY($1);

I get "ERROR:  structure of query does not match function result type", even 
though the type signatures of the returned columns match the 
"connection_generation" rowtype.  I am pretty sure this could be resolved by 
casting the resulting columns to that row type, but I am lost as to how the 
syntax to do such a thing would look.

Thanks in advance for the help, and keep up the great work.  PG8.3 is an 
amazing piece of software and it blows me away how much more advanced it gets 
with every release.

Bart Grantham
VP of R&D
Logicworks Inc. - Complex and Managed Hosting

Reply via email to