[
https://issues.apache.org/jira/browse/FLINK-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14905213#comment-14905213
]
ASF GitHub Bot commented on FLINK-2576:
---------------------------------------
Github user fhueske commented on the pull request:
https://github.com/apache/flink/pull/1138#issuecomment-142723198
Hi @jkovacs, thanks for all your efforts to make the projection work. Going
for a `GenericTypeInfo` would work in many cases, but unfortunately not in all.
For example, `union` in Flink operates at the serialization level and requires
that all data sets being unioned use the same serializer. If we transparently
fall back to a `GenericTypeInfo`, users might be surprised that
`DataSet<Tuple2<String,Long>>.union(DataSet<Tuple2<String,Long>>)` does not work.
If we only support OuterJoins with an explicit JoinFunction, the user has full
control over how to deal with null values and can even use a custom Tuple type or
Tuple serializer (via `Operator.returns()`) that supports null values. In my
opinion, the best approach is to only support OuterJoins with JoinFunctions.
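
To illustrate the last point, here is a minimal plain-Java sketch of a left
outer join in which the caller supplies the join function and therefore decides
what a missing (null) partner turns into. It is not Flink code: the class
`OuterJoinSketch`, the method `leftOuterJoin`, and the use of `BiFunction` as a
stand-in for a JoinFunction are all hypothetical names chosen for this example,
and the inputs are simplified to maps with unique keys.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical illustration only; none of these names are part of the Flink API.
class OuterJoinSketch {

    // Left outer join over two keyed inputs (assumed to have unique keys).
    // The join function receives null as its second argument when the left
    // element has no matching partner on the right side.
    static <K, L, R, O> List<O> leftOuterJoin(
            Map<K, L> left, Map<K, R> right, BiFunction<L, R, O> joinFunction) {
        List<O> result = new ArrayList<>();
        for (Map.Entry<K, L> entry : left.entrySet()) {
            R match = right.get(entry.getKey()); // null if the key is absent
            result.add(joinFunction.apply(entry.getValue(), match));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, String> names = new LinkedHashMap<>();
        names.put(1, "alice");
        names.put(2, "bob");

        Map<Integer, Long> counts = new HashMap<>();
        counts.put(1, 10L);

        // The user-supplied function handles the null case explicitly,
        // so no null value ever reaches the output records.
        List<String> joined = leftOuterJoin(
                names, counts,
                (name, count) -> name + ":" + (count == null ? 0L : count));

        System.out.println(joined); // prints [alice:10, bob:0]
    }
}
```

Because the null handling lives in user code, the output type can stay a
regular type with its regular serializer, which is the property the `union`
example above depends on.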
> Add outer joins to API and Optimizer
> ------------------------------------
>
> Key: FLINK-2576
> URL: https://issues.apache.org/jira/browse/FLINK-2576
> Project: Flink
> Issue Type: Sub-task
> Components: Java API, Optimizer, Scala API
> Reporter: Ricky Pogalz
> Priority: Minor
> Fix For: pre-apache
>
>
> Add left/right/full outer join methods to the DataSet APIs (Java, Scala) and
> to the optimizer of Flink.
> Initially, the execution strategy should be a sort-merge outer join
> (FLINK-2105) but can later be extended to hash joins for left/right outer
> joins.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)