The larger point is that writing an estimator for an SRF is frequently a task about as difficult as writing the SRF itself
True, although I think this doesn't necessarily kill the idea. If writing an estimator for a given SRF is too difficult, the user is no worse off than they are today. Hopefully there would be a fairly large class of SRFs for which writing an estimator would be relatively simple, and result in improved planner behavior.
I don't foresee a whole lot of use of an estimator hook designed as proposed here. In particular, if the API is such that we can only use the estimator when all the function arguments are plan-time constants, it's not going to be very helpful.
Yes :( One approach might be to break the function's domain into pieces and have the estimator function calculate the estimated result set size for each piece. So, given a trivial function like:
foo(int): if $1 < 10 then produce 100 rows else produce 10000 rows
If the planner has encoded the distribution of input tuples to the function as a histogram, it could invoke the SRF's estimator function for the boundary values of each histogram bucket, and use that to get an idea of the function's likely result set size at runtime.
And yes, the idea as sketched is totally unworkable :) For one thing, the difficulty of doing this grows rapidly as the number of arguments to the function increases. But perhaps there is some variant of this idea that might work...
Another thought is that the estimator could provide information on the cost of evaluating the function, the number of tuples produced by the function, and even the distribution of those tuples.
BTW, why is this on -performance? It should be on -hackers.
---------------------------(end of broadcast)--------------------------- TIP 8: explain analyze is your friend