Rui Wang created BEAM-7151:
------------------------------

             Summary: Support conjunction clause when it's only equi-join
                 Key: BEAM-7151
                 URL: https://issues.apache.org/jira/browse/BEAM-7151
             Project: Beam
          Issue Type: Sub-task
          Components: dsl-sql
            Reporter: Rui Wang


conjunction_clause: function_call(function_parameter, ...) | field_access | column
function_parameter: function_call | field_access

In Beam, an equi-join is implemented with CoGBK (CoGroupByKey), which requires 
each join input (assuming a binary join) to be built into a PCollection of 
KV<Row, Row>, where the key is the join key.
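
As a rough illustration of the keying step, here is a minimal sketch in plain 
Python. It does not use the Beam API: cogbk_equi_join and its key-extractor 
arguments are hypothetical names, and lists of dicts stand in for PCollections 
of Row.

```python
from collections import defaultdict


def cogbk_equi_join(left_rows, right_rows, left_key, right_key):
    """Simulate an equi-join by keying both inputs and grouping by key."""
    # Step 1: turn each input into (key, row) pairs, mirroring KV<Row, Row>.
    left_kv = [(left_key(row), row) for row in left_rows]
    right_kv = [(right_key(row), row) for row in right_rows]

    # Step 2: co-group the two keyed inputs, as CoGBK would.
    grouped = defaultdict(lambda: ([], []))
    for key, row in left_kv:
        grouped[key][0].append(row)
    for key, row in right_kv:
        grouped[key][1].append(row)

    # Step 3: for every key, emit each pairing of a left row with a right row.
    return [(l, r) for ls, rs in grouped.values() for l in ls for r in rs]


left = [{"a": 1}, {"a": 2}]
right = [{"b": 0}, {"b": 1}]
# Join condition a = 2 - b: the key is `a` on the left, `2 - b` on the right.
joined = cogbk_equi_join(
    left, right, lambda row: (row["a"],), lambda row: (2 - row["b"],)
)
```

Each key extractor reads columns from only one input, which is what allows 
the two inputs to be keyed independently before grouping.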

For an equi-join, each conjunction clause is essentially an equation. To build 
KV<Row, Row>, the columns on the two sides of the equation must come from 
different join inputs. For example, with a from the left input and b from the 
right input, a + b = 2 cannot be used to build a join key, but a = 2 - b can. 
So a clause that does not satisfy this property must be rewritten.
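
The property can be sketched as a small check in plain Python. The function 
name and the representation of each side as a list of column names are 
assumptions made for illustration, not BeamSQL code.

```python
def sides_are_separable(lhs_cols, rhs_cols, left_input_cols, right_input_cols):
    """Return True if each side of `lhs = rhs` reads from one distinct input."""
    lhs, rhs = set(lhs_cols), set(rhs_cols)
    left, right = set(left_input_cols), set(right_input_cols)
    if not lhs or not rhs:
        # A side without any column cannot supply a join key for its input.
        return False
    # Either lhs is keyed on the left input and rhs on the right, or vice versa.
    return (lhs <= left and rhs <= right) or (lhs <= right and rhs <= left)


# a + b = 2: both columns sit on the same side, so no key can be built.
needs_rewrite = not sides_are_separable(["a", "b"], [], ["a"], ["b"])
# a = 2 - b: one column per side, each from a different input.
usable_as_is = sides_are_separable(["a"], ["b"], ["a"], ["b"])
```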

This also implies that not every clause is rewritable. Say the clause is 
f(a, b) = 3, where a is from the left input and b is from the right input. If 
the function f is not splittable, so that we cannot move a or b to the right 
side of the equation, then we cannot support this clause in BeamSQL's join.
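
A minimal sketch of such a rewrite, in plain Python, is shown here. It is not 
BeamSQL or Calcite code: the tuple-based expression encoding and the names 
columns and rewrite are hypothetical, only + and - are treated as splittable, 
and the right-hand side is assumed to contain no left-input columns to start 
with.

```python
def columns(expr):
    """Collect the column names referenced by an expression tuple."""
    kind = expr[0]
    if kind == "col":
        return {expr[1]}
    if kind == "lit":
        return set()
    # ("call", name, arg, ...) keeps its operands after the function name.
    operands = expr[2:] if kind == "call" else expr[1:]
    return set().union(*(columns(e) for e in operands))


def rewrite(lhs, rhs, left_cols):
    """Try to rewrite `lhs = rhs` so that lhs uses only left-input columns.

    Returns (new_lhs, new_rhs), or None if the clause cannot be split.
    """
    cols = columns(lhs)
    if cols and cols <= left_cols:
        return lhs, rhs
    op = lhs[0]
    if op in ("+", "-"):
        x, y = lhs[1], lhs[2]
        if not (columns(y) & left_cols):
            # x + y = rhs  =>  x = rhs - y ;  x - y = rhs  =>  x = rhs + y
            inverse = "-" if op == "+" else "+"
            return rewrite(x, (inverse, rhs, y), left_cols)
        if not (columns(x) & left_cols):
            # x + y = rhs  =>  y = rhs - x ;  x - y = rhs  =>  y = x - rhs
            moved = ("-", rhs, x) if op == "+" else ("-", x, rhs)
            return rewrite(y, moved, left_cols)
    # Opaque function calls and anything else are treated as not splittable.
    return None


a, b = ("col", "a"), ("col", "b")
split_sum = rewrite(("+", a, b), ("lit", 2), {"a"})            # a + b = 2
split_call = rewrite(("call", "f", a, b), ("lit", 3), {"a"})   # f(a, b) = 3
```

Here split_sum is the pair for a = 2 - b, while split_call is None because 
nothing is known about f that would let a or b be moved across the equation.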





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
