Github user chiwanpark commented on the pull request: https://github.com/apache/flink/pull/907#issuecomment-122376238 Hi, I am reviewing this changes. I'm not done yet but I found some points which are able to improve. First, there are some duplicated classes such as `SimpleFlatJoinFunction`, `MatchRemovingMatcher`, `Match`, `CollectionIterator`. I think that we can this classes move under `org.apache.flink.runtime.operators.testutils` package. After moving them, they can be shared with test cases for hash-based outer join. Second, this is just my opinion, how about creating iterator classes for each outer join type such as, `AbstractMergeLeftOuterJoinIterator`, `AbstractMergeRightOuterJoinIterator`, `AbstractMergeFullOuterJoinIterator` and derived classes by reusing variable? I'm concerned about time consuming by comparing outer join type for many records in `callWithNextKey` method. The outer join type is already decided before doing join operation. But I'm not sure that there is obvious performance decrease by this comparing. If the performance decrease is negligible, the second suggestion could be ignored.
--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---