sunjincheng121 opened a new pull request #9322: [Flink-13473][table] Add FlatAggregate support to stream Table API(blink planner) URL: https://github.com/apache/flink/pull/9322 ## What is the purpose of the change This pull request ports non-window flatAggregate from Flink planner to Blink planner. The changes mainly include three parts: blink-planner, blink-runtime and some meta handers. The code of the API part can be reused as the two planner share the same API. ## Brief change log - Add plan support for table aggregate. [279fea2b05](https://github.com/apache/flink/commit/279fea2b05d5fe6c428a1a315f9b903d3740aa52) This commit adds RelNodes(Logical, FlinkLogical and StreamExec) and Rules in blink planner. The execution procedures are: First, the TableAggregate QueryOperation node in API level will be converted to a LogicalTableAggregate relnode in QueryOperationConverter. Second, the LogicalTableAggregate will be converted to a FlinkLogicalTableAggregate by the rule in logical optimization phase. Third, the FlinkLogicalTableAggregate will be converted to a StreamExecGroupTableAggregate by the rule in physical optimization phase. - Add runtime support for table aggregate. [b4c0231e3](https://github.com/apache/flink/commit/b4c0231e30c9d0f37c4eb3d7fa354cbfa6201db3) This commit adds codegen and runtime process function for table aggregate. The AggCodeGen generates a TableAggsHandleFunction which is used in the GroupTableAggFunction KeyedProcessFunction. A TableAggsHandleFunction contains methods like createAccumulators(), accumulate(), emitValue() etc. In this part, The implementation of TableAggregate is very similar to Aggregate. The biggest difference is the emit strategy of TableAggregate. For TableAggregate, it emits multi rows and multi columns, while Aggregate emits single row and single column each time. - Support TableAggregate in some MetadataHandler. [e3d5e780a1](https://github.com/apache/flink/commit/e3d5e780a11ddd27ab6721772925db730d95c75c) MetadataHandler is used to provide Metadata of a relational expression. For example, provide the unique key information of a relnode. The information can be used during optimization. This commit adds TableAggregate support in FlinkRelMdColumnInterval, FlinkRelMdFilteredColumnInterval and FlinkRelMdModifiedMonotonicity. The absence of TableAggregate in other MetadataHandlers mainly because: - The TableAggregate can not provide uniqueness information which is required in FlinkRelMdColumnUniqueness, FlinkRelMdDistinctRowCount, FlinkRelMdUniqueGroups, FlinkRelMdUniqueKeys. - Most of the MetadataHandlers are not been used in streaming jobs,such as FlinkRelMdPercentageOriginalRows,FlinkRelMdPopulationSize, FlinkRelMdRowCount, etc. ## Verifying this change - Adds Table Aggregate Tests for in blink planner. - Adds Meta handler Tests in blink planner. - Adds Harness Tests in blink planner ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): ( no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no)
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services
