[ 
https://issues.apache.org/jira/browse/FLINK-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14905213#comment-14905213
 ] 

ASF GitHub Bot commented on FLINK-2576:
---------------------------------------

Github user fhueske commented on the pull request:

    https://github.com/apache/flink/pull/1138#issuecomment-142723198
  
    Hi @jkovacs, thanks for all your efforts to make the projection work. Going 
for a `GenericTypeInfo` would work in many cases, but unfortunately not in all. 
For example, `union` in Flink operates at the serialization level and requires 
that all unioned data sets use the same serializer. If we transparently used a 
`GenericTypeInfo`, users might be surprised that 
`DataSet<Tuple2<String,Long>>.union(DataSet<Tuple2<String,Long>>)` does not work. 
If we only support OuterJoins with an explicit JoinFunction, the user has full 
control over how to deal with null values and can even use a custom Tuple type 
or Tuple serializer (via `Operator.returns()`) that supports null values. In my 
opinion, the best approach is to only support OuterJoins with JoinFunctions.
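
To illustrate the idea of leaving null handling to an explicit join function, 
here is a minimal plain-Java sketch. It does not use Flink: `OuterJoinSketch`, 
`Pair`, and `leftOuterJoin` are hypothetical names made up for this example, 
and the hash-based matching is only a stand-in, not the sort-merge strategy 
discussed in the issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

public class OuterJoinSketch {

    /** Minimal stand-in for a two-field tuple (hypothetical, not Flink's Tuple2). */
    static final class Pair<A, B> {
        final A f0;
        final B f1;

        Pair(A f0, B f1) {
            this.f0 = f0;
            this.f1 = f1;
        }
    }

    /**
     * Hypothetical left outer join on the first field. The join function is
     * called once per matching pair, or once with right == null when a left
     * element has no match, so the caller decides how to represent the gap.
     */
    static <K, L, R, O> List<O> leftOuterJoin(
            List<Pair<K, L>> left,
            List<Pair<K, R>> right,
            BiFunction<Pair<K, L>, Pair<K, R>, O> joinFunction) {
        // Index the right side by key so each left element can look up its matches.
        Map<K, List<Pair<K, R>>> index = new HashMap<>();
        for (Pair<K, R> r : right) {
            index.computeIfAbsent(r.f0, k -> new ArrayList<>()).add(r);
        }
        List<O> out = new ArrayList<>();
        for (Pair<K, L> l : left) {
            List<Pair<K, R>> matches = index.get(l.f0);
            if (matches == null) {
                // No partner on the right side: hand null to the user function.
                out.add(joinFunction.apply(l, null));
            } else {
                for (Pair<K, R> r : matches) {
                    out.add(joinFunction.apply(l, r));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Pair<String, Long>> left =
                Arrays.asList(new Pair<>("a", 1L), new Pair<>("b", 2L));
        List<Pair<String, Long>> right =
                Arrays.asList(new Pair<>("a", 10L));

        // The user-supplied function maps a missing right side to -1, so the
        // output type never has to carry a null field.
        List<String> joined = leftOuterJoin(left, right,
                (l, r) -> l.f0 + ":" + l.f1 + ":" + (r == null ? -1L : r.f1));

        System.out.println(joined); // prints [a:1:10, b:2:-1]
    }
}
```

Because the function itself picks the output type and the placeholder for a 
missing side, no null-tolerant generic serializer has to be substituted behind 
the user's back.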


> Add outer joins to API and Optimizer
> ------------------------------------
>
>                 Key: FLINK-2576
>                 URL: https://issues.apache.org/jira/browse/FLINK-2576
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Java API, Optimizer, Scala API
>            Reporter: Ricky Pogalz
>            Priority: Minor
>             Fix For: pre-apache
>
>
> Add left/right/full outer join methods to the DataSet APIs (Java, Scala) and 
> to the optimizer of Flink.
> Initially, the execution strategy should be a sort-merge outer join 
> (FLINK-2105) but can later be extended to hash joins for left/right outer 
> joins.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
