Hello,

Reading the manual recently ( http://www.postgresql.org/docs/8.3/interactive/xfunc-volatility.html ) I came across this:

> Because of the snapshotting behavior of MVCC (see Chapter 13,
> http://www.postgresql.org/docs/8.3/interactive/mvcc.html), a function
> containing only SELECT commands can safely be marked STABLE, even if it
> selects from tables that might be undergoing modifications by concurrent
> queries. PostgreSQL will execute a STABLE function using the snapshot
> established for the calling query, and so it will see a fixed view of the
> database throughout that query. Also note that the current_timestamp
> family of functions qualify as stable, since their values do not change
> within a transaction.
It struck me that it might not be all that safe to mark a SELECT-only function STABLE instead of VOLATILE (or vice versa). Consider an example:

create table t1(id int);

create or replace function f1() returns void as $$
declare
    i int;
begin
    select count(*) into i from t1;
    raise notice '%', i;
    -- waste some time
    for i in 1..700000000 loop
    end loop;
    select count(*) into i from t1;
    raise notice '%', i;
end;
$$ language 'plpgsql';

Now, in the first connection, do:

select f1();

While execution is inside the loop (which takes a while), do in another connection:

insert into t1 values (1);

The function returns with the following notices:

NOTICE: 0
NOTICE: 1

If I change the volatility of f1() to STABLE and run the above again, I get:

NOTICE: 1
NOTICE: 1

It looks like plpgsql functions, at least, use the most recent snapshot on each call to the SPI manager rather than the snapshot of the calling query, so, since the default transaction isolation level in Postgres is READ COMMITTED, concurrent transactions may affect the result of a read-only VOLATILE function. I wonder whether functions in other languages (including SQL and C) behave the same way? (A rough SQL-language test I mean to try is sketched at the end of this mail.)

Another thing I've recently discovered is that SQL functions seem to be unexpectedly slow to call. Example:

create or replace function f2sql(int) returns int as $$
    select case when $1 < 100000 then 1
                when 100000 <= $1 and $1 < 500000 then 2
                when $1 >= 500000 then 3
           end;
$$ language 'sql' immutable;

create or replace function f2plpgsql(int) returns int as $$
begin
    return case when $1 < 100000 then 1
                when 100000 <= $1 and $1 < 500000 then 2
                when $1 >= 500000 then 3
           end;
end;
$$ language 'plpgsql' immutable;

These two functions do exactly the same calculation on their input and differ only in the language used. Now I write a query involving them and wrap it in another function (so that I can use PERFORM to avoid any overhead of fetching results to the client, cache the plan, and measure the time more precisely):

create or replace function f3() returns void as $$
declare
    st timestamp;
begin
    st := clock_timestamp();
    perform f2sql(trunc(1000000*random())::int) +
            f2sql(trunc(1000000*random())::int) +
            f2sql(trunc(1000000*random())::int) +
            f2sql(trunc(1000000*random())::int) +
            f2sql(trunc(1000000*random())::int) +
            f2sql(trunc(1000000*random())::int) +
            f2sql(trunc(1000000*random())::int) +
            f2sql(trunc(1000000*random())::int) +
            f2sql(trunc(1000000*random())::int) +
            f2sql(trunc(1000000*random())::int)
        from generate_series(1, 100000);
    raise notice '%', clock_timestamp() - st;
end;
$$ language 'plpgsql';

create or replace function f4() returns void as $$
declare
    st timestamp;
begin
    st := clock_timestamp();
    perform f2plpgsql(trunc(1000000*random())::int) +
            f2plpgsql(trunc(1000000*random())::int) +
            f2plpgsql(trunc(1000000*random())::int) +
            f2plpgsql(trunc(1000000*random())::int) +
            f2plpgsql(trunc(1000000*random())::int) +
            f2plpgsql(trunc(1000000*random())::int) +
            f2plpgsql(trunc(1000000*random())::int) +
            f2plpgsql(trunc(1000000*random())::int) +
            f2plpgsql(trunc(1000000*random())::int) +
            f2plpgsql(trunc(1000000*random())::int)
        from generate_series(1, 100000);
    raise notice '%', clock_timestamp() - st;
end;
$$ language 'plpgsql';

Now f4() reports 4.2 sec of runtime on average, while f3() reports 10.3 sec. That is a notable difference, especially considering that the SQL function is likely to be inlined. Am I missing something? (A simpler timing cross-check I plan to run is also sketched below.)
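For the first question, here is a rough sketch of the SQL-language test I have in mind. The name f1sql is my own, and pg_sleep() stands in for the busy loop; since an SQL function only returns the result of its last statement, I would truncate t1 first and look at the single count it returns:

-- sketch only: f1sql is a made-up name, pg_sleep() replaces the busy loop
create or replace function f1sql() returns bigint as $$
    select pg_sleep(20);        -- pause so another session can insert into t1
    select count(*) from t1;    -- only this last result is returned
$$ language 'sql' volatile;

truncate t1;

-- session 1:
select f1sql();

-- session 2, while session 1 is inside pg_sleep():
insert into t1 values (1);

-- If f1sql() returns 1, the SQL function also took a fresh snapshot for its
-- last statement; if it returns 0, it used the calling query's snapshot.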
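For the second question, a simpler timing cross-check I plan to run, to take the plpgsql wrapper and PERFORM out of the picture (just a sketch using psql's \timing; the row count is arbitrary):

\timing

select sum(f2sql(trunc(1000000*random())::int))
  from generate_series(1, 1000000);

select sum(f2plpgsql(trunc(1000000*random())::int))
  from generate_series(1, 1000000);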