Rui Wang created BEAM-7151:
------------------------------
Summary: Support conjunction clause when it's only equi-join
Key: BEAM-7151
URL: https://issues.apache.org/jira/browse/BEAM-7151
Project: Beam
Issue Type: Sub-task
Components: dsl-sql
Reporter: Rui Wang
conjunction_clause: function_call(function_parameter, ...) | field_access |
column
function_parameter: function_call | field_access
In Beam, an equi-join is implemented with CoGBK (CoGroupByKey), which requires
both join inputs (assuming a binary join) to be built into PCollections of
KV<Row, Row>, where the key is the join key.
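
The keyed-join idea above can be sketched in plain Python. This is only an
illustration, not Beam code: co_group_by_key and equi_join are hypothetical
helpers, rows are modeled as dicts, and the sample tables are made up.

```python
from collections import defaultdict


def co_group_by_key(left_kvs, right_kvs):
    """Group two keyed collections by key, loosely mimicking what CoGBK does."""
    grouped = defaultdict(lambda: ([], []))
    for key, row in left_kvs:
        grouped[key][0].append(row)
    for key, row in right_kvs:
        grouped[key][1].append(row)
    return grouped


def equi_join(left_rows, right_rows, left_key, right_key):
    """Inner equi-join: key each input separately, then pair rows per key."""
    left_kvs = [(left_key(row), row) for row in left_rows]
    right_kvs = [(right_key(row), row) for row in right_rows]
    joined = []
    for lefts, rights in co_group_by_key(left_kvs, right_kvs).values():
        for left_row in lefts:
            for right_row in rights:
                joined.append((left_row, right_row))
    return joined


# Made-up sample inputs for the clause orders.customer_id = customers.id.
orders = [{"order_id": 1, "customer_id": 10}, {"order_id": 2, "customer_id": 11}]
customers = [{"id": 10, "name": "a"}, {"id": 12, "name": "b"}]
result = equi_join(
    orders, customers, lambda r: r["customer_id"], lambda r: r["id"]
)
```

Note that each key function only ever sees a row from its own input, which is
why the join key must be computable from one side alone.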
For an equi-join, a conjunction clause is essentially an equation. To build
KV<Row, Row>, the columns on the two sides of the equation must come from
different join inputs. For example, with a from the left input and b from the
right input, a + b = 2 cannot be used to build a join key, but a = 2 - b can.
So a clause that does not satisfy this property must be rewritten.
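
As a rough illustration of the rewritten form, the sketch below keys the left
input on a and the right input on 2 - b, so each key is computed from a single
input. It is plain Python, not BeamSQL; join_on_rewritten_clause is a
hypothetical name and the rows are made up.

```python
def join_on_rewritten_clause(left_rows, right_rows):
    """Join on a = 2 - b, the rewritten form of a + b = 2."""
    # Right-side key uses only the right column b.
    right_by_key = {}
    for right_row in right_rows:
        right_by_key.setdefault(2 - right_row["b"], []).append(right_row)
    # Left-side key uses only the left column a.
    return [
        (left_row, right_row)
        for left_row in left_rows
        for right_row in right_by_key.get(left_row["a"], [])
    ]


left = [{"a": 0}, {"a": 1}, {"a": 5}]
right = [{"b": 1}, {"b": 2}, {"b": 7}]
pairs = join_on_rewritten_clause(left, right)
```

Every returned pair satisfies the original clause a + b = 2, even though the
join itself only compares per-side keys.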
It also implies that not every clause is rewritable. Say the clause is f(a, b)
= 3, in which a is from the left input and b is from the right input. If the
function f is not splittable, so that we cannot move a or b to the right-hand
side of the equation, then we cannot support this clause in BeamSQL's join.
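
A minimal sketch of the property check described above, assuming a toy
expression encoding that is not BeamSQL's or Calcite's: a column is a string,
a literal is an int, and a call is a tuple (name, arg, ...). The helper names
columns and usable_as_join_key are hypothetical. The check only tests whether
a clause is already in usable form; it does not attempt any rewriting.

```python
def columns(expr):
    """Collect the column names referenced by a toy expression tree."""
    if isinstance(expr, str):
        return {expr}
    if isinstance(expr, tuple):
        found = set()
        for arg in expr[1:]:  # expr[0] is the function or operator name
            found |= columns(arg)
        return found
    return set()  # literals reference no columns


def usable_as_join_key(lhs, rhs, left_cols, right_cols):
    """True if each side of lhs = rhs draws columns from exactly one input,
    and the two sides draw from different inputs."""
    lhs_cols, rhs_cols = columns(lhs), columns(rhs)
    if not lhs_cols or not rhs_cols:
        return False  # one side has no columns, so it cannot key an input
    return (lhs_cols <= left_cols and rhs_cols <= right_cols) or (
        lhs_cols <= right_cols and rhs_cols <= left_cols
    )


left_cols, right_cols = {"a"}, {"b"}
sum_form = usable_as_join_key(("+", "a", "b"), 2, left_cols, right_cols)
rewritten = usable_as_join_key("a", ("-", 2, "b"), left_cols, right_cols)
opaque = usable_as_join_key(("f", "a", "b"), 3, left_cols, right_cols)
```

Here sum_form and opaque are False while rewritten is True, matching the
a + b = 2, a = 2 - b, and f(a, b) = 3 examples in the description.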