Hello guys,
I don't know if this is the correct list. Correct me if I'm wrong. I have a directed graph, or better, a tree in postgresql 8.3. One table are the nodes and another one are the connections. Given any node, I need to get all nodes down to it(to the leafs) that have relation with anotherTable. Also, this connection change on time, so I have a connectionDate and a disconnectionDate for each connection (which might be null to represent open interval). This way, I wrote a pgsql function (I rename the tables and columns to generic names). These are the tables and the function: CREATE TABLE node ( id_node integer NOT NULL, CONSTRAINT node_pkey PRIMARY KEY (id_node) ); CREATE TABLE anotherTable ( id_anotherTable integer NOT NULL, id_node integer NOT NULL, CONSTRAINT anothertable_pkey PRIMARY KEY (id_anotherTable), CONSTRAINT anothertable_node_fkey FOREIGN KEY (id_node) REFERENCES node. id_node MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION ); CREATE TABLE connection ( id_connection integer NOT NULL, down integer NOT NULL, up integer NOT NULL, connectionDate timestamp with time zone, disconnectionDate timestamp with time zone, CONSTRAINT connection_pkey PRIMARY KEY (id_connection), CONSTRAINT down_fkey FOREIGN KEY (down) REFERENCES (id_node) REFERENCES node. id_node MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION, CONSTRAINT up_fkey FOREIGN KEY (up) REFERENCES (id_node) REFERENCES node. id_node MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION); CREATE TABLE observation ( id_observation integer NOT NULL, id_node integer NOT NULL, date timestamp with time zone, CONSTRAINT observation_pkey PRIMARY KEY (id_observation), CONSTRAINT observation_node_fkey FOREIGN KEY (id_node) REFERENCES node. id_node MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION ); CREATE OR REPLACE FUNCTION get_nodes_related_to_anothertable(integer,timestamp with time zone) RETURNS SETOF integer AS 'DECLARE _id ALIAS FOR $1; _date ALIAS FOR $2; _conn RECORD; BEGIN return query SELECT 1 FROM anothertable WHERE id_node = _id; FOR _ conn IN SELECT * FROM connection c where c.up = _id LOOP if _conn. connectionDate > _date then continue; end if; if _conn. disconnectionDate < _data then continue; end if; return query SELECT * from get_nodes_related_to_anothertable(_conn.down, _date); END LOOP; END' LANGUAGE 'plpgsql' IMMUTABLE; And I use it on my SELECT: SELECT * FROM (SELECT id_node, date FROM observation ) root_node_obs, node, anotherTable WHERE anotherTable.id_node = node.id_node AND node.id_node IN ( select * from get_nodes_related_to_anothertable(root_node_obs .id_node,root_node_obs .date)); Even with IMMUTABLE on the function, postgresql executes the function many times with the same parameters. In a single run: select * from get_nodes_related_to_anothertable(236,now()); it returns 5 rows and runs in 27ms. But in the bigger SELECT, it take 5s to each observation row (and I may have some :-) ). I know that IN generally is not optimization-friendly but I don't know how to use my function without it. Any clues guys? Thanks,