On Thursday, 27 April, 2017 17:31, Joseph L. Casale <jcas...@activenetwerx.com> 
inquired:

> Hey guys,
> So what are the semantics of the predicate on the join versus the where
> clause?

> Just curious...

When the joins are only equijoins (inner joins) of any type:

SELECT <columns>
FROM <table1>
JOIN <table2> ON <condition2>
JOIN <table3> ON <condition3>
WHERE <condition1>

is syntactic sugar for:

SELECT <columns>
FROM <table1>, <table2>, <table3>
WHERE (<condition1>)
AND (<condition2>)
AND (<condition3>)

and the latter case is the one that is always processed.  The processing is as 
if there were a cross join of all the tables done and then all the conditions 
applied to select the returned rows from the selection.  In actual fact the 
query optimizer decides which order to process the table loops and were to 
apply the conditions to generate the result most efficiently.  In other words, 
for normal equijoins, the ON clause is just a where clause and there is no 
special significance attached.

You could also say:

SELECT <columns>
FROM <table1>
JOIN <table2> ON (<condition1>) AND (<condition2>) AND (<condition3>)
JOIN <table3>

or even think of it as (though this is an error)

SELECT <columns>
FROM <table1> ON <condition1>
JOIN <table2> ON <condition2>
JOIN <table3> ON <condition3>

and it would have the same effect.

However, if you use "CROSS JOIN" then you are prohibiting the query planner 
from re-ordering the nested-loop order of the two tables mentioned to the left 
and right of the CROSS JOIN keywords.  The processing of the ON/WHERE clauses 
is, however, identical to an normal inner join.  CROSS JOIN merely specifies a 
specific visitation order for the tables to its left and right (outer and inner 
respectively).

However, in the case of a LEFT or OUTER join, the ON conditions attached to the 
table on the rhs of the join is "glued" to the inner loop scanning that table, 
and any tables in mentioned in that ON clause must be visited before the OUTER 
JOIN can be processed and ONLY the conditions in that ON clause are used in the 
inner loop of the rhs table.  Otherwise all equijoin ON clauses become WHERE 
clause filters and the query optimizer is free to re-order and optimize how 
they are visited and how the filtering is applied.

So, the processing order is 
(1) "glue" any ON conditions that belong to an OUTER JOINed rhs table to that 
tables' inner loop, push all other ON conditions down into the WHERE clause.
(2) optimize the visitation order of the tables and apply applicable WHERE 
conditions into the outer loops.  Only the specified conditions in the ON 
clause of an outer  join are used to select candidates from the inner loop of 
an outer join (other conditions from the WHERE clause cannot be applied)
(3) apply any leftover WHERE conditions
(4) apply any GROUP BY (which may be optimized up to step 2)
(5) apply any HAVING conditions


** All tables mentioned in the ON clauses glued to an OUTER join rhs table must 
be visited before that outer join can be visited.

See https://sqlite.org/queryplanner.html
And https://sqlite.org/optoverview.html
And https://sqlite.org/queryplanner-ng.html




_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to