[ 
https://issues.apache.org/jira/browse/FLINK-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15297806#comment-15297806
 ] 

ASF GitHub Bot commented on FLINK-3941:
---------------------------------------

Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2025#discussion_r64338066
  
    --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/nodes/dataset/DataSetUnion.scala ---
    @@ -69,7 +73,7 @@ class DataSetUnion(
           rows + metadata.getRowCount(child)
         }
     
    -    planner.getCostFactory.makeCost(rowCnt, 0, 0)
    +    planner.getCostFactory.makeCost(if (all) rowCnt else rowCnt * 0.1, 0, 0)
    --- End diff --
    
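    The cost for UNION should be higher than for UNION ALL, not lower, because UNION additionally has to eliminate duplicates.
    Also, `rowCnt` is the number of rows processed, not the result size, so it should not be scaled down.
    How about this?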
    ```
    planner.getCostFactory.makeCost(
      rowCnt, 
      if (all) 0 else rowCnt,
      if (all) 0 else rowCnt)
    ```
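
The suggestion above can be tried out in isolation with a minimal sketch. It assumes the three `makeCost` arguments are row count, CPU cost and I/O cost, in that order; `Cost` and `unionCost` are hypothetical stand-ins for illustration, not part of the pull request or of the planner API.

```scala
object UnionCostSketch {
  // Hypothetical stand-in for the planner's cost object (rows, cpu, io).
  final case class Cost(rows: Double, cpu: Double, io: Double)

  // Cost of a union whose inputs together contain `rowCnt` rows.
  // all = true models UNION ALL, all = false models UNION with duplicate elimination.
  def unionCost(rowCnt: Double, all: Boolean): Cost =
    Cost(
      rowCnt,                   // rows processed: the same for both variants
      if (all) 0.0 else rowCnt, // cpu: charged only when duplicates are removed
      if (all) 0.0 else rowCnt  // io: charged only when duplicates are removed
    )
}
```

With this shape, the cost of a non-all union is never below that of the corresponding UNION ALL, which is the ordering the comment asks for.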


> Add support for UNION (with duplicate elimination)
> --------------------------------------------------
>
>                 Key: FLINK-3941
>                 URL: https://issues.apache.org/jira/browse/FLINK-3941
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table API
>    Affects Versions: 1.1.0
>            Reporter: Fabian Hueske
>            Assignee: Yijie Shen
>            Priority: Minor
>
> Currently, only UNION ALL is supported by Table API and SQL.
> UNION (with duplicate elimination) can be supported by applying a 
> {{DataSet.distinct()}} after the union on all fields. This issue includes:
> - Extending {{DataSetUnion}}
> - Relaxing {{DataSetUnionRule}} to translate non-all unions.
> - Extending the Table API with a union() method.
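
The approach described in the issue (a union followed by a distinct over all fields) can be illustrated with plain Scala collections. This is only a sketch of the semantics, not the Flink implementation; `Row`, `unionAll` and `union` are hypothetical names, and rows are modelled as tuples so that equality compares every field.

```scala
object UnionDistinctSketch {
  // A row with two fields; tuple equality compares all fields.
  type Row = (Int, String)

  // UNION ALL: concatenate both inputs and keep duplicates.
  def unionAll(left: Seq[Row], right: Seq[Row]): Seq[Row] =
    left ++ right

  // UNION: the same concatenation followed by duplicate elimination on all fields.
  def union(left: Seq[Row], right: Seq[Row]): Seq[Row] =
    unionAll(left, right).distinct
}
```

For inputs that share a row, `unionAll` returns that row twice while `union` returns it once.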



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
