[
https://issues.apache.org/jira/browse/DERBY-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786263#action_12786263
]
Dag H. Wanvik commented on DERBY-4405:
--------------------------------------
Trying to understand the code for the outer join reordering a.k.a.
linearization that Derby performs, cf.
HalfOuterJoinNode#LOJ_reorderable.
Is anyone familiar with what the rationale/intuition for this transformation
is? I did not find it
described in the Tuning guide
(http://db.apache.org/derby/docs/10.5/tuning/ctuntransform55045.html).
It does mention the simplification to inner joins, though.
This fairly recent paper which does address it [1] in much detail, but Derby's
logic is fairly limited..
[1] Galindo-Legaria, C. & Rosenthal, A.: "Outerjoin simplification and
reordering for query optimization", ACM Transactions on Database Systems, Vol
22, No 1, March 1997.
> Transformation to inner join not performed for certain three-way joins
> ----------------------------------------------------------------------
>
> Key: DERBY-4405
> URL: https://issues.apache.org/jira/browse/DERBY-4405
> Project: Derby
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 10.6.0.0
> Reporter: Knut Anders Hatlen
> Assignee: Dag H. Wanvik
> Priority: Minor
> Attachments: derby-4405.diff, derby-4405.stat
>
>
> In the CROSS JOIN section in the reference manual
> (http://db.apache.org/derby/docs/dev/ref/rrefsqljcrossjoin.html) there are
> three examples that are supposed to be equivalent. However, the performance
> differs significantly between the different queries.
> The queries use the tours db and look like this:
> (1)
> SELECT * FROM CITIES LEFT OUTER JOIN
> (FLIGHTS CROSS JOIN COUNTRIES)
> ON CITIES.AIRPORT = FLIGHTS.ORIG_AIRPORT
> WHERE COUNTRIES.COUNTRY_ISO_CODE = 'US'
> (2)
> SELECT * FROM CITIES LEFT OUTER JOIN
> FLIGHTS INNER JOIN COUNTRIES ON 1=1
> ON CITIES.AIRPORT = FLIGHTS.ORIG_AIRPORT
> WHERE COUNTRIES.COUNTRY_ISO_CODE = 'US'
> (3)
> SELECT * FROM CITIES LEFT OUTER JOIN
> (SELECT * FROM FLIGHTS, COUNTRIES) S
> ON CITIES.AIRPORT = S.ORIG_AIRPORT
> WHERE S.COUNTRY_ISO_CODE = 'US'
> When executed in ij, (1) and (2) need 6 seconds to complete, whereas (3)
> completes in 50 ms.
> The query plans for (1) and (2) use nested loop joins and table scans. (3)
> uses a combination of hash join and nested loop join, and index scans as well
> as table scans.
> It looks like (3) has been rewritten from a left outer join to an inner join
> internally. This is fine because all rows that have the right-side columns
> filled with NULLs will be filtered out by the predicate
> S.COUNTRY_ISO_CODE='US', so the extra rows generated by the outer join will
> not be returned.
> This optimization should also be possible for (1) and (2). We should improve
> the logic so that those joins are transformed too. The transformation happens
> in HalfOuterJoinNode.transformOuterJoins().
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.