[ 
https://issues.apache.org/jira/browse/FLINK-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631756#comment-14631756
 ] 

ASF GitHub Bot commented on FLINK-2105:
---------------------------------------

Github user chiwanpark commented on the pull request:

    https://github.com/apache/flink/pull/907#issuecomment-122376238
  
    Hi, I am reviewing these changes. I'm not done yet, but I found some points 
that could be improved.
    
    First, there are some duplicated classes such as `SimpleFlatJoinFunction`, 
`MatchRemovingMatcher`, `Match`, and `CollectionIterator`. I think we can move 
these classes under the `org.apache.flink.runtime.operators.testutils` package. 
After moving them, they can be shared with the test cases for the hash-based 
outer join.
    
    Second, and this is just my opinion: how about creating iterator classes 
for each outer join type, such as `AbstractMergeLeftOuterJoinIterator`, 
`AbstractMergeRightOuterJoinIterator`, and `AbstractMergeFullOuterJoinIterator`, 
with derived classes for the reusing and non-reusing variants? I'm concerned 
about the time spent comparing the outer join type for every record in the 
`callWithNextKey` method. The outer join type is already decided before the 
join operation starts. However, I'm not sure that this comparison causes a 
noticeable performance decrease. If the decrease is negligible, the second 
suggestion can be ignored.
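    
    To make the second suggestion concrete, here is a minimal sketch of the two designs. It does not use the real Flink classes: `OuterJoinDispatchSketch`, `UnmatchedHandler`, `BranchingHandler`, and the three `*OuterHandler` classes are hypothetical names made up for illustration, and plain strings stand in for records.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Flink code: contrasts a per-call join-type check
// with one specialized class per outer join type.
class OuterJoinDispatchSketch {

    enum OuterJoinType { LEFT, RIGHT, FULL }

    /** Callbacks a merge loop would invoke when a key exists on only one side. */
    interface UnmatchedHandler {
        void unmatchedLeft(String left, List<String> out);
        void unmatchedRight(String right, List<String> out);
    }

    /** Variant A: a single class that compares the join type on every call. */
    static final class BranchingHandler implements UnmatchedHandler {
        private final OuterJoinType type;

        BranchingHandler(OuterJoinType type) { this.type = type; }

        public void unmatchedLeft(String left, List<String> out) {
            if (type != OuterJoinType.RIGHT) out.add(left + "|null");
        }

        public void unmatchedRight(String right, List<String> out) {
            if (type != OuterJoinType.LEFT) out.add("null|" + right);
        }
    }

    /** Variant B: the join type is fixed by the class that is instantiated. */
    static final class LeftOuterHandler implements UnmatchedHandler {
        public void unmatchedLeft(String left, List<String> out) { out.add(left + "|null"); }
        public void unmatchedRight(String right, List<String> out) { /* dropped */ }
    }

    static final class RightOuterHandler implements UnmatchedHandler {
        public void unmatchedLeft(String left, List<String> out) { /* dropped */ }
        public void unmatchedRight(String right, List<String> out) { out.add("null|" + right); }
    }

    static final class FullOuterHandler implements UnmatchedHandler {
        public void unmatchedLeft(String left, List<String> out) { out.add(left + "|null"); }
        public void unmatchedRight(String right, List<String> out) { out.add("null|" + right); }
    }

    /** Feeds one unmatched left record and one unmatched right record to the handler. */
    static List<String> run(UnmatchedHandler handler) {
        List<String> out = new ArrayList<>();
        handler.unmatchedLeft("a", out);
        handler.unmatchedRight("b", out);
        return out;
    }
}
```

    Both variants produce the same output for a given join type. Whether the extra comparison in variant A is measurable at all would have to be checked with a benchmark.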


> Implement Sort-Merge Outer Join algorithm
> -----------------------------------------
>
>                 Key: FLINK-2105
>                 URL: https://issues.apache.org/jira/browse/FLINK-2105
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Local Runtime
>            Reporter: Fabian Hueske
>            Assignee: Ricky Pogalz
>            Priority: Minor
>             Fix For: pre-apache
>
>
> Flink does not natively support outer joins at the moment. 
> This issue proposes to implement a sort-merge outer join algorithm that can 
> cover left, right, and full outer joins.
> The implementation can be based on the regular sort-merge join iterator 
> ({{ReusingMergeMatchIterator}} and {{NonReusingMergeMatchIterator}}, see also 
> the {{MatchDriver}} class).
> The Reusing and NonReusing variants differ in whether object instances are 
> reused or new objects are created. I would start with the NonReusing variant 
> which is safer from a user's point of view and should also be easier to 
> implement.
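
For readers unfamiliar with the algorithm named in the issue, the following is a minimal sketch of a sort-merge outer join over two already-sorted key arrays. It is an illustration under simplifying assumptions, not the Flink implementation: `SortMergeOuterJoinSketch`, `JoinType`, and `join` are hypothetical names, records are reduced to their integer keys, and everything fits in memory.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Flink code: sort-merge outer join on sorted int keys.
class SortMergeOuterJoinSketch {

    enum JoinType { LEFT, RIGHT, FULL }

    /**
     * Joins two key arrays that are sorted in ascending order.
     * Each result is "left|right", with "null" marking the missing side.
     */
    static List<String> join(int[] left, int[] right, JoinType type) {
        List<String> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < left.length && j < right.length) {
            if (left[i] < right[j]) {
                // Left key has no partner; keep it for left and full outer joins.
                if (type != JoinType.RIGHT) out.add(left[i] + "|null");
                i++;
            } else if (left[i] > right[j]) {
                // Right key has no partner; keep it for right and full outer joins.
                if (type != JoinType.LEFT) out.add("null|" + right[j]);
                j++;
            } else {
                // Equal keys: find both runs of duplicates and emit their cross product.
                int key = left[i];
                int iEnd = i;
                while (iEnd < left.length && left[iEnd] == key) iEnd++;
                int jEnd = j;
                while (jEnd < right.length && right[jEnd] == key) jEnd++;
                for (int a = i; a < iEnd; a++) {
                    for (int b = j; b < jEnd; b++) {
                        out.add(left[a] + "|" + right[b]);
                    }
                }
                i = iEnd;
                j = jEnd;
            }
        }
        // One input is exhausted; whatever remains on the other side is unmatched.
        if (type != JoinType.RIGHT) {
            for (; i < left.length; i++) out.add(left[i] + "|null");
        }
        if (type != JoinType.LEFT) {
            for (; j < right.length; j++) out.add("null|" + right[j]);
        }
        return out;
    }
}
```

With `left = {1, 2, 2, 4}` and `right = {2, 3, 4, 5}`, a full outer join yields `1|null, 2|2, 2|2, null|3, 4|4, null|5`; the left and right variants drop the `null|…` and `…|null` rows respectively.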



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
