[ 
https://issues.apache.org/jira/browse/FLINK-5750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16536811#comment-16536811
 ] 

ASF GitHub Bot commented on FLINK-5750:
---------------------------------------

GitHub user AlexanderKoltsov opened a pull request:

    https://github.com/apache/flink/pull/6287

    [FLINK-5750] Incorrect translation of n-ary Union

    ## What is the purpose of the change
    
    *This pull request adds support for multiple inputs in DataSetUnionRule*
    
    
    ## Brief change log
    
      - *DataSetUnionRule should consider all inputs instead of only the 1st 
and 2nd*
    
    
    ## Verifying this change
    
    *This change added the following test:*
    *- Added unit test testValuesWithCast, which validates the VALUES operator 
with values that have to be cast. The plan optimizer transforms this query into 
a UNION of VALUES because the VALUES arguments are not literal values*
    
    ## Does this pull request potentially affect one of the following parts:
    
      - Dependencies (does it add or upgrade a dependency): (yes / **no**)
      - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
      - The serializers: (yes / **no** / don't know)
      - The runtime per-record code paths (performance sensitive): (yes / 
**no** / don't know)
      - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
      - The S3 file system connector: (yes / **no** / don't know)
    
    ## Documentation
    
      - Does this pull request introduce a new feature? (yes / **no**)
      - If yes, how is the feature documented? (not applicable / docs / 
JavaDocs / not documented)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/AlexanderKoltsov/flink master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/6287.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #6287
    
----
commit 534ad0227a060bddc0fdde50b2dd397b4f000916
Author: Alexander Koltsov <alexander_koltsov@...>
Date:   2018-07-09T11:35:08Z

    [FLINK-5750] Incorrect translation of n-ary Union
    
    Calcite's union operator supports more than two input relations.
    However, Flink's translation rules only consider the first two relations
    because we assumed that Calcite's union is binary.
    This problem exists for batch and streaming queries.

----


> Incorrect translation of n-ary Union
> ------------------------------------
>
>                 Key: FLINK-5750
>                 URL: https://issues.apache.org/jira/browse/FLINK-5750
>             Project: Flink
>          Issue Type: Bug
>          Components: Table API & SQL
>    Affects Versions: 1.2.0, 1.3.4, 1.5.0, 1.4.2, 1.6.0
>            Reporter: Anton Mushin
>            Assignee: Alexander Koltsov
>            Priority: Critical
>              Labels: pull-request-available
>
> Calcite's union operator supports more than two input relations. However, 
> Flink's translation rules only consider the first two relations because we 
> assumed that Calcite's union is binary. 
> This problem exists for batch and streaming queries.
> It seems that Calcite only generates non-binary Unions in rare cases 
> ({{(SELECT * FROM t) UNION ALL (SELECT * FROM t) UNION ALL (SELECT * FROM 
> t)}} results in two binary union operators) but the problem definitely needs 
> to be fixed.
> The following query can be used to validate the problem. 
> {code:java}
> @Test
> public void testValuesWithCast() throws Exception {
>     ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
>     BatchTableEnvironment tableEnv = TableEnvironment.getTableEnvironment(env, config());
>     // VALUES with casts: the arguments are not plain literals
>     String sqlQuery = "VALUES (1, cast(1 as BIGINT) )," +
>         "(2, cast(2 as BIGINT))," +
>         "(3, cast(3 as BIGINT))";
>     // VALUES with plain literals, used as the control case
>     String sqlQuery2 = "VALUES (1,1)," +
>         "(2, 2)," +
>         "(3, 3)";
>     Table result = tableEnv.sql(sqlQuery);
>     DataSet<Row> resultSet = tableEnv.toDataSet(result, Row.class);
>     List<Row> results = resultSet.collect();
>     Table result2 = tableEnv.sql(sqlQuery2);
>     DataSet<Row> resultSet2 = tableEnv.toDataSet(result2, Row.class);
>     List<Row> results2 = resultSet2.collect();
>     String expected = "1,1\n2,2\n3,3";
>     compareResultAsText(results2, expected);
>     compareResultAsText(results, expected);
> }
> {code}
> Actual result for the {{results}} variable:
> {noformat}
> java.lang.AssertionError: Different elements in arrays: expected 3 elements 
> and received 2
>  expected: [1,1, 2,2, 3,3]
>  received: [1,1, 2,2] 
> Expected :3
> Actual   :2
> {noformat}
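
The failure above (two rows received where three were expected) is consistent with a translation that wires up only the first two inputs of a three-input union. The following is a minimal sketch of that behavior next to an n-ary fold. It uses plain Java lists in place of Calcite relational nodes, the class and method names are hypothetical, and it is not the actual DataSetUnionRule code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the translation rule; not Flink or Calcite code.
class NaryUnionSketch {

    // Binary UNION ALL: concatenates two inputs and keeps duplicates.
    static List<String> binaryUnionAll(List<String> left, List<String> right) {
        List<String> out = new ArrayList<>(left);
        out.addAll(right);
        return out;
    }

    // Faulty translation: assumes the union is binary and ignores
    // every input after the second one.
    static List<String> translateFirstTwoOnly(List<List<String>> inputs) {
        return binaryUnionAll(inputs.get(0), inputs.get(1));
    }

    // N-ary translation: folds all inputs into a left-deep chain
    // of binary unions.
    static List<String> translateAllInputs(List<List<String>> inputs) {
        List<String> result = new ArrayList<>(inputs.get(0));
        for (int i = 1; i < inputs.size(); i++) {
            result = binaryUnionAll(result, inputs.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        // One single-row input per VALUES row, mirroring the three-row query.
        List<List<String>> inputs = Arrays.asList(
            Arrays.asList("1,1"),
            Arrays.asList("2,2"),
            Arrays.asList("3,3"));

        System.out.println(translateFirstTwoOnly(inputs)); // [1,1, 2,2]
        System.out.println(translateAllInputs(inputs));    // [1,1, 2,2, 3,3]
    }
}
```

In this model the first-two-only variant drops the third row, matching the "received" list in the assertion error, while the fold returns all three rows.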



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
