hequn8128 opened a new pull request #9141: [FLINK-12249][table] Fix type equivalence check problems for Window Aggregates URL: https://github.com/apache/flink/pull/9141 ## What is the purpose of the change This pull request fixes type equivalence check problems for Window Aggregates ### background Creating Aggregate node fails in rules: LogicalWindowAggregateRule and ExtendedAggregateExtractProjectRule if the only grouping expression is a window and we compute aggregation on NON NULLABLE field. The root cause for that, is how return type inference strategies in calcite work and how we handle window aggregates. Take org.apache.calcite.sql.type.ReturnTypes#AGG_SUM as an example, based on groupCount it adjusts type nullability based on groupCount. ### analysis There is no problem for non-window aggregate that return type turn to nullable if the call occurs within a "GROUP BY ()" query, e.g. in "select sum(x) as s from empty", s may be null. This is also the behavior of common databases, such as mysql. The result of query "select sum(x) as s from empty" is a null output. However, for window aggregate, there is no way to output null, as we don't know which window the null belongs to, so the result should be an empty result which means the result type is also not null. ### solution To fix this problem is to change the current sum/avg SqlAggFunction to a self-defined one which avoids turning the "not null" to "nullable" for window aggregate. We can simply use a Rule to archive this. ## Brief change log - Add a self-defined sum and avg SqlAggFunction which will not change return type to nullable even if the call occurs within a "GROUP BY ()" for window aggregates. - Add a Rule(`ReplaceWindowAggregateFunctionRule`) to change the sum/avg to the self-defined sum/avg(`SqlWindowAvgAggFunction` and `SqlWindowSumAggFunction`). - Add `getSqlAggFunctionString` in `CommonAggregate`. This method makes sure that the new self-defined sum/avg returns same string with sum/agg. - Add tests. ## Verifying this change This change added tests and can be verified as follows: - Added plan tests for flink, blink. - Added plan tests for batch, streaming. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no)
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services
