[
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15536178#comment-15536178
]
Anton Mushin commented on FLINK-4604:
-------------------------------------
[~twalthr],thanks!
I have any question about this issue.
1. Am I correct understand in general the test should look like as
{code:java}
public void testNewAggregationFunctions() throws Exception {
//set env ....
String sqlQuery = "SELECT
AVG(x),STDDEV_POP(x),STDDEV_SAMP(x),VAR_POP(x),VAR_SAMP(x) FROM table";
Table result = tableEnv.sql(sqlQuery);
/*
AVG(x) = SUM(x) / COUNT(x)
STDDEV_POP(x) = SQRT( (SUM(x * x) - SUM(x) * SUM(x) / COUNT(x))
/ COUNT(x))
STDDEV_SAMP(x)= SQRT((SUM(x * x) - SUM(x) * SUM(x) / COUNT(x))
/ CASE COUNT(x) WHEN 1 THEN NULL ELSE COUNT(x) - 1 END)
VAR_POP(x)= (SUM(x * x) - SUM(x) * SUM(x) / COUNT(x))/ COUNT(x)
VAR_SAMP(x) = (SUM(x * x) - SUM(x) * SUM(x) / COUNT(x))/ CASE
COUNT(x) WHEN 1 THEN NULL ELSE COUNT(x) - 1 END
*/
String sqlQuery1 = "SELECT SUM(x)/COUNT(x), " +
"SQRT( (SUM(x * x) - SUM(x) * SUM(x) / COUNT(x)) /
COUNT(x)), " +
"SQRT( (SUM(x * x) - SUM(x) * SUM(x) / COUNT(x)) / CASE
COUNT(x) WHEN 1 THEN NULL ELSE COUNT(x) - 1 END), " +
"(SUM(x * x) - SUM(x) * SUM(x) / COUNT(x)) / COUNT(x),
" +
"(SUM(x * x) - SUM(x) * SUM(x) / COUNT(x)) / CASE
COUNT(x) WHEN 1 THEN NULL ELSE COUNT(x) - 1 END " +
"FROM table";
Table result1 = tableEnv.sql(sqlQuery1);
DataSet<Row> resultSet = tableEnv.toDataSet(result, Row.class);
List<Row> results = resultSet.collect();
DataSet<Row> expectedResultSet = tableEnv.toDataSet(result1,
Row.class);
List<Row> expectedResults = expectedResultSet.collect();
compareResult(results, expected);
}
{code}
+some single test for each new function?
2. For support AggregateReduceFunctionsRule I should add to
org.apache.flink.api.table.plan.rules.FlinkRuleSets
AggregateReduceFunctionsRule.INSTANCE and implement logic
STDDEV_POP,STDDEV_SAMP,VAR_POP,VAR_SAMP in package
org.apache.flink.api.table.runtime.aggregate like for AvgAggregate, isn't it?
And define new aggregate functions in other places where it is necessary, for
example: org.apache.flink.api.table.validate.FunctionCatalog
> Add support for standard deviation/variance
> -------------------------------------------
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
> Issue Type: New Feature
> Components: Table API & SQL
> Reporter: Timo Walther
> Assignee: Anton Mushin
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP,
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test
> and document this rule.
> If we also want to add this aggregates to Table API is up for discussion.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)