My application has a function, call it "foo()", that requires initialization from a table
of about 800 values. Rather than build these values into the C code, it seemed like a good idea to
put them on a PG table and create a second function, call it "foo_init()", which is
called for each value, like this:
select foo_init(value) from foo_init_table order by value_id;
This works well, but it requires me to actually retrieve the function's value
800 times. So I thought I'd be clever:
select count(1) from (select foo_init(value) from foo_init_table order by
value_id) as foo;
And indeed, it count() returns 800, as expected. But my function foo_init()
never gets called! Apparently the optimizer figures out that foo_init() must
return one value for each row, so it doesn't bother to actually call the
function.
db=> explain select count(1) from (select foo_init(value) from foo_init_table
order by db_no) as foo;
query plan
----------------------------------------------------------------------------------------------------
aggregate (cost=69.95..69.95 rows=1 width=0)
-> Subquery Scan foo (cost=0.00..67.93 rows=806 width=0)
-> Index Scan using foo_init_table_pkey on foo_init_table
(cost=0.00..59.87 rows=806 width=30)
This doesn't seem right to me -- how can the optimizer possibly know that a
function doesn't have a side effect, as in my case? Functions could do all
sorts of things, such as logging activity, filling in other tables, etc, etc.
Am I missing something here?
Thanks,
Craig
---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?
http://archives.postgresql.org