[jira] [Commented] (FLINK-3475) DISTINCT aggregate function support for SQL queries

ASF GitHub Bot (JIRA) Fri, 17 Feb 2017 03:19:39 -0800

    [ https://issues.apache.org/jira/browse/FLINK-3475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15871698#comment-15871698 ]


ASF GitHub Bot commented on FLINK-3475:
---------------------------------------

Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/3111#discussion_r101733431
  
    --- Diff: flink-libraries/flink-table/src/test/scala/org/apache/flink/table/api/scala/batch/sql/AggregationsITCase.scala ---
    @@ -213,34 +213,45 @@ class AggregationsITCase(
         TestBaseUtils.compareResultAsText(results.asJava, expected)
       }
     
    -  @Test(expected = classOf[TableException])
    +  @Test
       def testDistinctAggregate(): Unit = {
     
         val env = ExecutionEnvironment.getExecutionEnvironment
         val tEnv = TableEnvironment.getTableEnvironment(env, config)
     
         val sqlQuery = "SELECT sum(_1) as a, count(distinct _3) as b FROM MyTable"
     
    -    val ds = CollectionDataSets.get3TupleDataSet(env)
    -    tEnv.registerDataSet("MyTable", ds)
    +    val ds = env.fromElements(
    +      (1, 1L, 1.0f, "Hello"),
    +      (2, 2L, 1.0f, "Ciao")).toTable(tEnv)
    +    tEnv.registerTable("MyTable", ds)
     
    -    // must fail. distinct aggregates are not supported
    -    tEnv.sql(sqlQuery).toDataSet[Row]
    +    val result = tEnv.sql(sqlQuery)
    +
    +    val expected = "3,1"
    +    val results = result.toDataSet[Row].collect()
    +    TestBaseUtils.compareResultAsText(results.asJava, expected)
       }
     
    -  @Test(expected = classOf[TableException])
    +  @Test
       def testGroupedDistinctAggregate(): Unit = {
     
         val env = ExecutionEnvironment.getExecutionEnvironment
         val tEnv = TableEnvironment.getTableEnvironment(env, config)
     
         val sqlQuery = "SELECT _2, avg(distinct _1) as a, count(_3) as b FROM MyTable GROUP BY _2"
     
    -    val ds = CollectionDataSets.get3TupleDataSet(env)
    -    tEnv.registerDataSet("MyTable", ds)
    +    val ds = env.fromElements(
    --- End diff --
    
    I think it would be good to use a bit more test data here, like one of the `CollectionDataSets`.
    ITCases are rather expensive to run, so we should get the most out of them.
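
With a larger input, the expected strings are harder to work out by hand. One possible way to cross-check them is a plain-Scala reference computation of the two queries in the diff, with no Flink involved. This is only a sketch: the object name, the helper names, and the sample rows are hypothetical (the rows are not the contents of any `CollectionDataSets` method), and the integer division for `avg` assumes that the average of an INT column is returned as an INT.

```scala
// Plain-Scala reference for the two SQL queries in the tests; no Flink involved.
// All names and rows here are hypothetical and serve illustration only.
object DistinctAggregateReference {

  // Sample rows with duplicates in _1 and _3 so that DISTINCT changes the result.
  val rows: Seq[(Int, Long, String)] = Seq(
    (1, 1L, "Hi"),
    (2, 2L, "Hello"),
    (3, 2L, "Hello"),
    (4, 3L, "Ciao"),
    (4, 3L, "Hi"))

  // SELECT sum(_1) as a, count(distinct _3) as b FROM MyTable
  def globalAgg(data: Seq[(Int, Long, String)]): (Int, Int) =
    (data.map(_._1).sum, data.map(_._3).distinct.size)

  // SELECT _2, avg(distinct _1) as a, count(_3) as b FROM MyTable GROUP BY _2
  def groupedAgg(data: Seq[(Int, Long, String)]): Map[Long, (Int, Int)] =
    data.groupBy(_._2).map { case (key, group) =>
      // Deduplicate first, then aggregate the remaining values.
      val distinctValues = group.map(_._1).distinct
      // Assumption: avg over an INT column yields an INT (integer division).
      key -> (distinctValues.sum / distinctValues.size, group.size)
    }

  def main(args: Array[String]): Unit = {
    println(globalAgg(rows))
    groupedAgg(rows).toSeq.sortBy(_._1).foreach(println)
  }
}
```

Because the sample rows contain duplicates within a group, a test built on similar data would fail if the DISTINCT keyword were silently ignored, which a data set without duplicates cannot detect.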


> DISTINCT aggregate function support for SQL queries
> ---------------------------------------------------
>
>                 Key: FLINK-3475
>                 URL: https://issues.apache.org/jira/browse/FLINK-3475
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table API & SQL
>            Reporter: Chengxiang Li
>            Assignee: Zhenghua Gao
>
> DISTINCT aggregate function may be able to reuse the aggregate function 
> instead of separate implementation, and let Flink runtime take care of 
> duplicate records.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
