[
https://issues.apache.org/jira/browse/CALCITE-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17986640#comment-17986640
]
Zhen Chen commented on CALCITE-7077:
------------------------------------
Hi [~julianhyde][~mbudiu]
I've revised the Jira description. I want to support non-equi conditions and
conditions that are not strongly null-propagating.
There are some distinctions between pushing down PROJECT operations and
rewriting FULL JOIN as LEFT JOIN + RIGHT JOIN.
In the rewritten form, both the LEFT JOIN and RIGHT JOIN still use the original
FULL JOIN's conditions. But we must carefully apply IS NULL filtering on the
JOIN results. The LEFT JOIN's complete result set is preserved. Only RIGHT JOIN
rows where the left table's all columns are NULL are added.
For the condition e.id = 100, which only includes the left column, add IS NULL
to the RIGHT JOIN side, as described in Jira.
For the condition d.id = 100, which only includes the right column, add IS NULL
to the LFET JOIN side.
Such as
{code:java}
SELECT *
FROM Employees e
LEFT JOIN Departments d ON e.id = d.id
where d.id is null
UNION ALL
SELECT *
FROM Employees e
RIGHT JOIN Departments d ON e.id = d.id;
{code}
For the condition that includes both columns, add IS NULL of the left column to
the RIGHT JOIN side. For example, if the condition is
{code:java}
e.id = d.id
e.id > d.id
e.id = 100 and d.id = 100
e.id = 100 or d.id = 100
{code}
then the rewritten form is
{code:java}
SELECT *
FROM Employees e
LEFT JOIN Departments d ON e.id = d.id
UNION ALL
SELECT *
FROM Employees e
RIGHT JOIN Departments d ON e.id = d.id
where e.id is null;
{code}
If the condition contains multiple columns, for example
{code:java}
e.id = e.id and e.deptno = d.deptno
{code}
We can only take the first column involved in the condition to construct the IS
NULL condition. Such as:
{code:java}
SELECT *
FROM Employees e
LEFT JOIN Departments d ON e.id = d.id and e.deptno = d.deptno
UNION ALL
SELECT *
FROM Employees e
RIGHT JOIN Departments d ON e.id = d.id and e.deptno = d.deptno
where e.id is null;
{code}
IS NOT DISTINCT FROM may not be supported. If there is a data with all columns
as NULL on the left side, it is impossible to determine whether it is
automatically supplemented by OUTER JOIN.
If there is anything I have not considered fully or is incorrect, please help
correct me, thank you!
> Implement a rule to rewrite FULL JOIN as LEFT JOIN and RIGHT JOIN
> -----------------------------------------------------------------
>
> Key: CALCITE-7077
> URL: https://issues.apache.org/jira/browse/CALCITE-7077
> Project: Calcite
> Issue Type: New Feature
> Reporter: Zhen Chen
> Assignee: Zhen Chen
> Priority: Minor
>
> Invalid unparse for FULL JOIN in MySQLDialect is described in CALCITE-7050.
> The purpose is to allow SQL dialects that do not support FULL JOIN to execute
> according to the semantics of FULL JOIN. For this, a rule can be implemented
> to rewrite FULL JOIN into LEFT JOIN and RIGHT JOIN. For example:
> original sql:
> {code:java}
> SELECT *
> FROM Employees e
> FULL JOIN Departments d ON e.id = d.id
> {code}
> rewrite into
> {code:java}
> SELECT *
> FROM Employees e
> LEFT JOIN Departments d ON e.id = d.id
> UNION ALL
> SELECT *
> FROM Employees e
> RIGHT JOIN Departments d ON e.id = d.id
> where e.id is null;
> {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)