Github user chiwanpark commented on the pull request:

    https://github.com/apache/flink/pull/907#issuecomment-122376238
  
    Hi, I am reviewing this changes. I'm not done yet but I found some points 
which are able to improve.
    
    First, there are some duplicated classes such as `SimpleFlatJoinFunction`, 
`MatchRemovingMatcher`, `Match`, `CollectionIterator`. I think that we can this 
classes move under `org.apache.flink.runtime.operators.testutils` package. 
After moving them, they can be shared with test cases for hash-based outer join.
    
    Second, this is just my opinion, how about creating iterator classes for 
each outer join type such as, `AbstractMergeLeftOuterJoinIterator`, 
`AbstractMergeRightOuterJoinIterator`, `AbstractMergeFullOuterJoinIterator` and 
derived classes by reusing variable? I'm concerned about time consuming by 
comparing outer join type for many records in `callWithNextKey` method. The 
outer join type is already decided before doing join operation. But I'm not 
sure that there is obvious performance decrease by this comparing. If the 
performance decrease is negligible, the second suggestion could be ignored.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

Reply via email to