Re: [PERFORM] immutable functions vs. join for lookups ?
On 4/21/05, Enrico Weigelt <[EMAIL PROTECTED]> wrote: > > > Even if your data never changes it *can* change so the function should > > be at most stable not immutable. > > okay, the planner sees that the table could potentionally change. > but - as the dba - I'd like to tell him, this table *never* changes > in practise (or at most there will be an insert once a year) > > isnt there any way to enforce the function to be really immutable ? Never say never. :) And to answer your question -- your IMMUTABLE function may reference other functions (even VOLATILE). So you may create a "caller" immutable function which just calls your non-immutable function. But from performance standpoint there is not much difference (probably your STABLE function will be faster than STABLE inside IMMUTABLE function). Ah, and please note that some time in future PostgreSQL may require that IMMUTABLE function calls only IMMUTABLE functions. Regards, Dawid ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] immutable functions vs. join for lookups ?
On 4/21/05, Enrico Weigelt <[EMAIL PROTECTED]> wrote: > * Tom Lane <[EMAIL PROTECTED]> wrote: > > > > Yeah, I was actually thinking about a two-step process: inline the > > function to produce somethig equivalent to a handwritten scalar > > sub-SELECT, and then try to convert sub-SELECTs into joins. > > ... back to my original question ... > > What kind of query should I use ? > Is a join better than a function ? It appears that JOINs are usually faster. So if performance is an important issue, go with JOIN (and VIEWs probably). Functions are nicer (in terms off look and feel). Regards, Dawid ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] immutable functions vs. join for lookups ?
* Tom Lane <[EMAIL PROTECTED]> wrote: > Yeah, I was actually thinking about a two-step process: inline the > function to produce somethig equivalent to a handwritten scalar > sub-SELECT, and then try to convert sub-SELECTs into joins. ... back to my original question ... What kind of query should I use ? Is a join better than a function ? cu -- - Enrico Weigelt== metux IT service phone: +49 36207 519931 www: http://www.metux.de/ fax: +49 36207 519932 email: [EMAIL PROTECTED] cellphone: +49 174 7066481 - -- DSL ab 0 Euro. -- statische IP -- UUCP -- Hosting -- Webshops -- - ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] immutable functions vs. join for lookups ?
* Jaime Casanova <[EMAIL PROTECTED]> wrote: > Even if your data never changes it *can* change so the function should > be at most stable not immutable. okay, the planner sees that the table could potentionally change. but - as the dba - I'd like to tell him, this table *never* changes in practise (or at most there will be an insert once a year) isnt there any way to enforce the function to be really immutable ? cu -- - Enrico Weigelt== metux IT service phone: +49 36207 519931 www: http://www.metux.de/ fax: +49 36207 519932 email: [EMAIL PROTECTED] cellphone: +49 174 7066481 - -- DSL ab 0 Euro. -- statische IP -- UUCP -- Hosting -- Webshops -- - ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] immutable functions vs. join for lookups ?
Yay! You're right! I wonder why have I forgotten about LANGUAGE SQL. :) It's 30 seconds vs 5 seconds for CASE ... END insisde PLpgsql vs CASE...END LANGUAGE SQL. :) I.e. its almost the same as in-place entered SQL. Regards, Dawid ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] immutable functions vs. join for lookups ?
Yay! You're right! I wonder why have I forgotten about LANGUAGE SQL. :) It's 30 seconds vs 5 seconds for CASE ... END insisde PLpgsql vs CASE...END LANGUAGE SQL. :) I.e. its almost the same as in-place entered SQL. Probably because simple SQL functions get inlined by the optimiser. Chris ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] immutable functions vs. join for lookups ?
On 4/20/05, Jim C. Nasby <[EMAIL PROTECTED]> wrote: > You should re-run the function test using SQL as the function language > instead of plpgsql. There might be some performance to be had there. Yay! You're right! I wonder why have I forgotten about LANGUAGE SQL. :) It's 30 seconds vs 5 seconds for CASE ... END insisde PLpgsql vs CASE...END LANGUAGE SQL. :) I.e. its almost the same as in-place entered SQL. Regards, Dawid ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] immutable functions vs. join for lookups ?
You should re-run the function test using SQL as the function language instead of plpgsql. There might be some performance to be had there. -- Jim C. Nasby, Database Consultant [EMAIL PROTECTED] Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?" ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] immutable functions vs. join for lookups ?
Greg Stark <[EMAIL PROTECTED]> writes: > Tom Lane <[EMAIL PROTECTED]> writes: >> It would be interesting sometime to try to teach the planner about >> inlining SQL-language functions to become joins. That is, given > The Inlining of the function is presumably a side-issue. I have tons of > queries that use subqueries in the select list for which the same behaviour > would be appropriate. Yeah, I was actually thinking about a two-step process: inline the function to produce somethig equivalent to a handwritten scalar sub-SELECT, and then try to convert sub-SELECTs into joins. regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] immutable functions vs. join for lookups ?
Tom Lane <[EMAIL PROTECTED]> writes: > It would be interesting sometime to try to teach the planner about > inlining SQL-language functions to become joins. That is, given > > create function id2name(int) returns text as > 'select name from mytab where id = $1' language sql stable; > > select uid, id2name(uid) from othertab where something; > > I think that in principle this could automatically be converted to > > select uid, name from othertab left join mytab on (uid = id) where something; The Inlining of the function is presumably a side-issue. I have tons of queries that use subqueries in the select list for which the same behaviour would be appropriate. Things like select uid, (select name from mytab where id = uid) as name from othertab ... > There are some pitfalls though, particularly that you'd have to be able to > prove that the function's query couldn't return more than one row (else the > join might produce more result rows than the original query). Or just have a special join type that has the desired behaviour in that case. Ie, pretend the query was really SELECT * FROM othertab LEFT SINGLE JOIN mytab ... Where "LEFT SINGLE JOIN" is an imaginary syntax that doesn't actually have to exist in the parser, but exists in the planner/executor and behaves differently in the case of duplicate matches. Actually I could see such a syntax being useful directly too. -- greg ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] immutable functions vs. join for lookups ?
"Merlin Moncure" <[EMAIL PROTECTED]> writes: >> d) self-join with a function ;) >> EXPLAIN ANALYZE SELECT * FROM (SELECT n, id2username(n) AS username >> FROM (SELECT DISTINCT n FROM aaa) AS values) AS v_lookup RIGHT JOIN >> aaa USING (n); > That's pretty clever. > It sure seems like the server was not caching the results of the > function...maybe the server thought it was to small a table to bother? No, it probably flattened the subquery on sight (looking at the actual EXPLAIN output would confirm or disprove that). You could prevent the flattening by adding OFFSET 0 in the subquery. However, the SELECT DISTINCT sub-sub-query is expensive enough, and the join itself is expensive enough, that you would need an *enormously* expensive id2username() function to make this a win. It would be interesting sometime to try to teach the planner about inlining SQL-language functions to become joins. That is, given create function id2name(int) returns text as 'select name from mytab where id = $1' language sql stable; select uid, id2name(uid) from othertab where something; I think that in principle this could automatically be converted to select uid, name from othertab left join mytab on (uid = id) where something; which is much more amenable to join optimization. There are some pitfalls though, particularly that you'd have to be able to prove that the function's query couldn't return more than one row (else the join might produce more result rows than the original query). regards, tom lane ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] immutable functions vs. join for lookups ?
On 4/18/05, Merlin Moncure <[EMAIL PROTECTED]> wrote: > > d) self-join with a function ;) > > EXPLAIN ANALYZE SELECT * FROM (SELECT n, id2username(n) AS username > > FROM (SELECT DISTINCT n FROM aaa) AS values) AS v_lookup RIGHT JOIN > > aaa USING (n); > > That's pretty clever. > It sure seems like the server was not caching the results of the > function...maybe the server thought it was to small a table to bother? Nah, I don't thinks so. Having around 2 097 152 rows of 1s and 0s takes 48 seconds for id2username() query. The "self join" you've quoted above takes 32 seconds. SELECT n FROM aaa; takes 7 seconds. Thinking further... SELECT CASE n WHEN 0 THEN 'foo' WHEN 1 THEN 'bar' END FROM aaa; takes 9 seconds. CREATE OR REPLACE FUNCTION id2un_case(oid int) RETURNS text AS $$ BEGIN RETURN CASE oid WHEN 0 THEN 'foo' WHEN 1 THEN 'bar' END; END; $$ LANGUAGE plpgsql IMMUTABLE; SELECT id2un_case(n) FROM aaa; ...takes 36 seconds ...and to see how it depends on flags used: SELECT count(id2un_case(n)) FROM aaa; ...id2un_case(n) IMMUTABLE takes 29900,114 ms ...id2un_case(n) IMMUTABLE STRICT takes 30187,958 ms ...id2un_case(n) STABLE takes 31457,560 ms ...id2un_case(n) takes 33545,178 ms ...id2un_case(n) VOLATILE takes 35150,920 ms (and a count(CASE n WHEN ... END) FROM aaa takes: 2564,188 ms I understand that these measurements are not too accurate. They were done on idle system, and the queries were run couple of times (to make sure they're cached :)). I believe either something is minor performance difference between IMMUTABLE STABLE and even VOLATILE plpgsql... :( Oh, and doing things like "ORDER BY n" or "WHERE n = 1" didn't help either... I still wonder whether it's only my case or is there really something wrong with these functions? Regards, Dawid ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] immutable functions vs. join for lookups ?
> d) self-join with a function ;) > EXPLAIN ANALYZE SELECT * FROM (SELECT n, id2username(n) AS username > FROM (SELECT DISTINCT n FROM aaa) AS values) AS v_lookup RIGHT JOIN > aaa USING (n); That's pretty clever. It sure seems like the server was not caching the results of the function...maybe the server thought it was to small a table to bother? Merlin ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] immutable functions vs. join for lookups ?
On 4/15/05, Enrico Weigelt <[EMAIL PROTECTED]> wrote: > a) SELECT items.a, items.b, ..., users.username FROM items, users > WHERE items.uid = users.uid; > > c) CREATE FUNCTION id2username(oid) RETURNS text > LANGUAGE 'SQL' IMMUTABLE AS ' > SELECT username AS RESULT FROM users WHERE uid = $1'; You will be told that this function is not immutable but stable, and this is quite right. But consider such a function: CREATE OR REPLACE FUNCTION id2username (oid int) RETURNS TEXT AS $$ BEGIN IF oid = 0 THEN RETURN 'foo'; ELSIF oid = 1 THEN RETURN 'bar'; END IF; END; $$ LANGUAGE plpgsql IMMUTABLE; versus a lookup table with similar data. Logic suggests it should be faster than a table... It got me worried when I added: "RAISE WARNING 'Called'" after begin and I got lots of "Called" warnings when using this IMMUTABLE function in select... And the timings for ~6000 values in aaa table (and two values in lookup table) are: There is a query, output of the EXPLAIN ANALYZE, Time of EXPLAIN ANALYZE and "Real time" of SELECT (without EXPLAIN ANALYZE): a) simple select from temp table, and a lookup cost: EXPLAIN ANALYZE SELECT n FROM aaa; Seq Scan on aaa (cost=0.00..87.92 rows=5992 width=4) (actual time=0.011..24.849 rows=6144 loops=1) Total runtime: 51.881 ms (2 rows) Time: 52,882 ms Real time: 16,261 ms EXPLAIN ANALYZE SELECT id2username(n) FROM aaa limit 2; Limit (cost=0.00..0.03 rows=2 width=4) (actual time=0.111..0.150 rows=2 loops=1) -> Seq Scan on aaa (cost=0.00..104.80 rows=6144 width=4) (actual time=0.102..0.129 rows=2 loops=1) Total runtime: 0.224 ms (3 rows) Time: 1,308 ms Real time: 1,380 ms b) natural join with lookup table: EXPLAIN ANALYZE SELECT username FROM aaa NATURAL JOIN lookup; Hash Join (cost=2.45..155.09 rows=3476 width=32) (actual time=0.306..83.677 rows=6144 loops=1) Hash Cond: ("outer".n = "inner".n) -> Seq Scan on aaa (cost=0.00..87.92 rows=5992 width=4) (actual time=0.006..25.517 rows=6144 loops=1) -> Hash (cost=2.16..2.16 rows=116 width=36) (actual time=0.237..0.237 rows=0 loops=1) -> Seq Scan on lookup (cost=0.00..2.16 rows=116 width=36) (actual time=0.016..0.034 rows=2 loops=1) Total runtime: 107.378 ms (6 rows) Time: 109,040 ms Real time: 25,364 ms c) IMMUTABLE "static" lookup function: EXPLAIN ANALYZE SELECT id2username(n) FROM aaa; Seq Scan on aaa (cost=0.00..104.80 rows=6144 width=4) (actual time=0.089..116.397 rows=6144 loops=1) Total runtime: 143.800 ms (2 rows) Time: 144,869 ms Real time: 102,428 ms d) self-join with a function ;) EXPLAIN ANALYZE SELECT * FROM (SELECT n, id2username(n) AS username FROM (SELECT DISTINCT n FROM aaa) AS values) AS v_lookup RIGHT JOIN aaa USING (n); Hash Left Join (cost=506.82..688.42 rows=6144 width=36) (actual time=102.382..182.661 rows=6144 loops=1) Hash Cond: ("outer".n = "inner".n) -> Seq Scan on aaa (cost=0.00..89.44 rows=6144 width=4) (actual time=0.012..24.360 rows=6144 loops=1) -> Hash (cost=506.82..506.82 rows=2 width=36) (actual time=102.217..102.217 rows=0 loops=1) -> Subquery Scan v_lookup (cost=476.05..506.82 rows=2 width=36) (actual time=53.626..102.057 rows=2 loops=1) -> Subquery Scan "values" (cost=476.05..506.80 rows=2 width=4) (actual time=53.613..102.023 rows=2 loops=1) -> Unique (cost=476.05..506.77 rows=2 width=4) (actual time=53.456..101.772 rows=2 loops=1) -> Sort (cost=476.05..491.41 rows=6144 width=4) (actual time=53.440..76.710 rows=6144 loops=1) Sort Key: n -> Seq Scan on aaa (cost=0.00..89.44 rows=6144 width=4) (actual time=0.013..26.626 rows=6144 loops=1) Total runtime: 209.378 ms (11 rows) Time: 211,460 ms Real time: 46,682 ms ...so this IMMUTABLE is twice as slow (~100 ms) as the query joining itself with a SELECT DISTINCT on an IMMUTABLE function (~50 ms), which is twice as slow as JOIN against lookup table (~25 ms), and I feel this IMMUTABLE function could be around ~20 ms (~16 ms plus calling the function two times plus giving the values). Ah, and this is PostgreSQL 8.0.1 running under FreeBSD on a CPU: Intel(R) Celeron(R) CPU 2.40GHz (2400.10-MHz 686-class CPU). Regards, Dawid PS: I have a feeling that IMMUTABLE functions worked better in 7.4, yet I am unable to confirm this. ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] immutable functions vs. join for lookups ?
On 4/17/05, Enrico Weigelt <[EMAIL PROTECTED]> wrote: > * Tom Lane <[EMAIL PROTECTED]> wrote: > > Enrico Weigelt <[EMAIL PROTECTED]> writes: > > > c) CREATE FUNCTION id2username(oid) RETURNS text > > > LANGUAGE 'SQL' IMMUTABLE AS ' > > > SELECT username AS RESULT FROM users WHERE uid = $1'; > > > > This is simply dangerous. The function is *NOT* immutable (it is > > stable though). When ... not if ... your application breaks because > > you got the wrong answers, you'll get no sympathy from anyone. > > In my case it is immutable. The username never changes. > Even if your data never changes it *can* change so the function should be at most stable not immutable. regards, Jaime Casanova ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] immutable functions vs. join for lookups ?
* Tom Lane <[EMAIL PROTECTED]> wrote: > Enrico Weigelt <[EMAIL PROTECTED]> writes: > > c) CREATE FUNCTION id2username(oid) RETURNS text > > LANGUAGE 'SQL' IMMUTABLE AS ' > > SELECT username AS RESULT FROM users WHERE uid = $1'; > > This is simply dangerous. The function is *NOT* immutable (it is > stable though). When ... not if ... your application breaks because > you got the wrong answers, you'll get no sympathy from anyone. In my case it is immutable. The username never changes. cu -- - Enrico Weigelt== metux IT service phone: +49 36207 519931 www: http://www.metux.de/ fax: +49 36207 519932 email: [EMAIL PROTECTED] cellphone: +49 174 7066481 - -- DSL ab 0 Euro. -- statische IP -- UUCP -- Hosting -- Webshops -- - ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] immutable functions vs. join for lookups ?
Enrico Weigelt <[EMAIL PROTECTED]> writes: > c) CREATE FUNCTION id2username(oid) RETURNS text > LANGUAGE 'SQL' IMMUTABLE AS ' > SELECT username AS RESULT FROM users WHERE uid = $1'; This is simply dangerous. The function is *NOT* immutable (it is stable though). When ... not if ... your application breaks because you got the wrong answers, you'll get no sympathy from anyone. The correct question to ask was "if I make a stable function like this, is it likely to be faster than the join?". The answer is "probably not; at best it will be equal to the join". The best the planner is likely to be able to do with the function-based query is equivalent to a nestloop with inner indexscan (assuming there is an index on users.uid). If that's the best plan then the join case should find it too ... but if you are selecting a lot of items rows then it won't be the best plan. regards, tom lane ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[PERFORM] immutable functions vs. join for lookups ?
Hi folks, I like to use (immutable) functions for looking up serveral (almost constant) things, i.e fetching a username by id. This makes my queries more clear. But is this really performant ? Lets imagine: We've got an table with user accounts (uid,name,...). Then we've got another one which contains some items assigned to users, and so are linked to them by an uid field. Now want to view the items with usernames instead of just uid: a) SELECT items.a, items.b, ..., users.username FROM items, users WHERE items.uid = users.uid; c) CREATE FUNCTION id2username(oid) RETURNS text LANGUAGE 'SQL' IMMUTABLE AS ' SELECT username AS RESULT FROM users WHERE uid = $1'; SELECT items.a, items.b, ..., id2username(users.uid); Which one is faster with a) only a few users (<50) b) many users ( >1k ) while we have several 10k of items ? thx -- - Enrico Weigelt== metux IT service phone: +49 36207 519931 www: http://www.metux.de/ fax: +49 36207 519932 email: [EMAIL PROTECTED] cellphone: +49 174 7066481 - -- DSL ab 0 Euro. -- statische IP -- UUCP -- Hosting -- Webshops -- - ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])