[HACKERS] Outer where pushed down
Hi all, consider this view: CREATE OR REPLACE VIEW v_current_connection AS SELECT ul.id_user FROM user_login ul, current_connection cc WHERE ul.id_user = cc.id_user; And this is the explain on a usage of that view: # explain select * from v_current_connection_test where sp_connected_test(id_user) = FALSE; QUERY PLAN -- Hash Join (cost=42.79..1325.14 rows=451 width=5) Hash Cond: (outer.id_user = inner.id_user) - Seq Scan on user_login ul (cost=0.00..1142.72 rows=27024 width=4) Filter: (sp_connected_test(id_user) = false) - Hash (cost=40.49..40.49 rows=919 width=5) - Index Scan using idx_connected on current_connection cc (cost=0.00..40.49 rows=919 width=5) Index Cond: (connected = true) (7 rows) apart my initial surprise to see that function applied at rows not returned by the view ( Tom Lane explained me that the planner is able to push down the outer condition ) why postgres doesn't apply that function at table current_connection given the fact are extimated only 919 vs 27024 rows? redefining the view: CREATE OR REPLACE VIEW v_current_connection AS SELECT cc.id_user FROM user_login ul, current_connection cc WHERE ul.id_user = cc.id_user; then I obtain the desidered plan. # explain select * from v_current_connection_test where sp_connected_test(id_user ) = FALSE; QUERY PLAN -- Hash Join (cost=46.23..1193.47 rows=452 width=5) Hash Cond: (outer.id_user = inner.id_user) - Seq Scan on user_login ul (cost=0.00..872.48 rows=54048 width=4) - Hash (cost=45.08..45.08 rows=460 width=5) - Index Scan using idx_connected on current_connection cc (cost=0.00..45.08 rows=460 width=5) Index Cond: (connected = true) Filter: (sp_connected_test(id_user) = false) (7 rows) Is not possible in any way push postgres to apply that function to the right table ? Shall I rewrite the views figuring out wich column is better to expose ? Regards Gaetano Mendola ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Outer where pushed down
Gaetano Mendola [EMAIL PROTECTED] writes: CREATE OR REPLACE VIEW v_current_connection AS SELECT ul.id_user FROM user_login ul, current_connection cc WHERE ul.id_user = cc.id_user; # explain select * from v_current_connection_test where sp_connected_test(id_user) = FALSE; why postgres doesn't apply that function at table current_connection given the fact are extimated only 919 vs 27024 rows? Because the condition is on a field of the other table. You seem to wish that the planner would use ul.id_user = cc.id_user to decide that sp_connected_test(ul.id_user) can be rewritten as sp_connected_test(cc.id_user), but in general this is not safe. The planner has little idea of what the datatype-specific semantics of equality are, and none whatsoever what the semantics of your function are. As a real-world example: IEEE-standard floating point math considers that +0 and -0 are different bit patterns. They compare as equal, but it's very easy to come up with user-defined functions that will yield different results for the two inputs. So the proposed transformation is definitely unsafe for float8. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Outer where pushed down
Tom Lane wrote: Gaetano Mendola [EMAIL PROTECTED] writes: CREATE OR REPLACE VIEW v_current_connection AS SELECT ul.id_user FROM user_login ul, current_connection cc WHERE ul.id_user = cc.id_user; # explain select * from v_current_connection_test where sp_connected_test(id_user) = FALSE; why postgres doesn't apply that function at table current_connection given the fact are extimated only 919 vs 27024 rows? Because the condition is on a field of the other table. You seem to wish that the planner would use ul.id_user = cc.id_user to decide that sp_connected_test(ul.id_user) can be rewritten as sp_connected_test(cc.id_user), but in general this is not safe. The planner has little idea of what the datatype-specific semantics of equality are, and none whatsoever what the semantics of your function are. As a real-world example: IEEE-standard floating point math considers that +0 and -0 are different bit patterns. They compare as equal, but it's very easy to come up with user-defined functions that will yield different results for the two inputs. So the proposed transformation is definitely unsafe for float8. And what about to define for each type when this is safe and let the planner make his best choice ? Rewriting that view the execution time passed from 4 secs to 1 sec, that is not bad if the planner can do it autonomously. In this very example I can decide if it's better expose one column or the other one but in other cases not... Regards Gaetano Mendola ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings