On Mon, 2007-01-15 at 15:05 -0500, Tom Lane wrote:
> Brian Hurt <[EMAIL PROTECTED]> writes:
> > Non-developer here, but we use a lot of plpgsql functions at work. And
> > the functions we use fall into two broad, ill-defined catagories-
> > "expensive" functions and "cheap" functions. What I'd like as a user is
> > some way to tell the planner "this function is expensive- prefer plans
> > which call this function less even if they're otherwise more expensive"
> > or "this function is cheap, prefer plans that are otherwise less
> > expensive even if they call this function more often". Precise cost
> > estimates aren't that important, IMHO.
> Right, so a plain constant cost would be plenty for your situation.
> I suspect there's an 80/20 rule at work here --- the estimator-function
> side of this will take most of the effort to design/implement, but not
> get used nearly as much as the plain-constant form ... maybe we should
> just do the constant for starters and see how many people really want to
> write C-code estimators ...
> regards, tom lane
Hi Tom et al,
Having worked with stored procedures on large datasets for reporting, I
would say that it would be useful to have a non-constant estimator for
the number of rows, whereas a single CPU cost constant should be fine.
Where I have struggled with this has been joining onto slightly more
exotic queries when doing large scale data processing as part of a
custom report or an application upgrade.
Using PL/PGSQL I would find it useful to have access to the constants
passed into a function to be used to help provide a row count estimate
(typically useful for passing in table/column names), e.g.
SELECT * FROM my_func('my_table1') AS t1, my_table2 AS t2 WHERE t1.id =
CREATE FUNCTION my_func(text) AS $$
$$ LANGUAGE 'plpgsql' COST 1.0 ROWS my_func_row_cost;
In my cost function, I could then estimate the number of rows using
something like below, where all constants are passed into the cost
function as parameters, e.g.:
CREATE FUNCTION my_func_row_cost(text) AS $$
EXECUTE INTO foo 'SELECT COUNT(*) FROM ' || quote_literal($1);
$$ LANGUAGE 'plpgsql';
In the case where a parameter was not a constant but a column name, then
it would be reasonable in my mind to simply replace that parameter with
NULL when passing to the row cost function, e.g.
SELECT * FROM my_table1 WHERE my_table1.junk = (SELECT
In this case, the text parameter passed to my_func_row_cost would be
replaced by NULL to indicate that it was non-constant.
Of course, even with constants passed upon input, it still may not be
possible to calculate a number of rows that can be returned - it could
be the case that the only parameter passed to cost function has been
converted to NULL because it is a column name. Perhaps in this case it
would be useful to specify returning NULL from my_func_row_cost means "I
can't return anything meaningful, so use the fallback values".
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not