sunjincheng121 opened a new pull request #9322: [Flink-13473][table] Add 
FlatAggregate support to stream Table API(blink planner)
URL: https://github.com/apache/flink/pull/9322
 
 
   ## What is the purpose of the change
   This pull request ports non-window flatAggregate from the Flink planner to 
the Blink planner.
   The changes fall into three parts: blink-planner, blink-runtime and some 
metadata handlers. The API code can be reused because the two planners share 
the same API.
   
   ## Brief change log
   - Add plan support for table aggregate. 
[279fea2b05](https://github.com/apache/flink/commit/279fea2b05d5fe6c428a1a315f9b903d3740aa52)
   This commit adds RelNodes (Logical, FlinkLogical and StreamExec) and Rules 
to the blink planner. The conversion steps are:
   First, the TableAggregate QueryOperation node at the API level is converted 
to a LogicalTableAggregate RelNode in QueryOperationConverter.
   Second, the LogicalTableAggregate is converted to a 
FlinkLogicalTableAggregate by a rule in the logical optimization phase.
   Third, the FlinkLogicalTableAggregate is converted to a 
StreamExecGroupTableAggregate by a rule in the physical optimization phase.
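   
   The three conversion steps can be pictured as a chain of rewrites. The 
sketch below is plain Java with no Flink or Calcite dependency. Only the node 
names come from this description. The class `PlanPhasesSketch` and its methods 
are hypothetical, and strings stand in for the real RelNode classes and planner 
rules.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: strings stand in for plan nodes.
public class PlanPhasesSketch {

    // Applies one conversion step; returns the input unchanged if no step matches.
    static String convert(String node) {
        switch (node) {
            case "TableAggregate QueryOperation":   // API level -> logical RelNode
                return "LogicalTableAggregate";
            case "LogicalTableAggregate":           // logical optimization phase
                return "FlinkLogicalTableAggregate";
            case "FlinkLogicalTableAggregate":      // physical optimization phase
                return "StreamExecGroupTableAggregate";
            default:
                return node;
        }
    }

    // Records every node visited until no further conversion applies.
    static List<String> trace(String start) {
        List<String> steps = new ArrayList<>();
        String current = start;
        steps.add(current);
        while (true) {
            String next = convert(current);
            if (next.equals(current)) {
                break;
            }
            steps.add(next);
            current = next;
        }
        return steps;
    }

    public static void main(String[] args) {
        for (String step : trace("TableAggregate QueryOperation")) {
            System.out.println(step);
        }
    }
}
```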
   
   - Add runtime support for table aggregate. 
[b4c0231e3](https://github.com/apache/flink/commit/b4c0231e30c9d0f37c4eb3d7fa354cbfa6201db3)
   This commit adds codegen and a runtime process function for table 
aggregate. The AggCodeGen generates a TableAggsHandleFunction, which is used in 
the GroupTableAggFunction KeyedProcessFunction. A TableAggsHandleFunction 
contains methods such as createAccumulators(), accumulate() and emitValue().
   In this part, the implementation of TableAggregate is very similar to that 
of Aggregate. The biggest difference is the emit strategy: TableAggregate can 
emit multiple rows and multiple columns each time, while Aggregate emits a 
single row and a single column.
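   
   To illustrate the emit strategy, here is a minimal sketch of a "top 2" 
table aggregate in plain Java, without any Flink dependency. The method names 
loosely mirror createAccumulators(), accumulate() and emitValue() mentioned 
above, but the class `Top2Sketch`, its accumulator and the `int[]` row 
representation are assumptions made for illustration, not the generated 
TableAggsHandleFunction.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: not the code generated by the planner.
public class Top2Sketch {

    // Accumulator holding the two largest values seen so far (null = not set).
    static final class Top2Accumulator {
        Integer first;
        Integer second;
    }

    static Top2Accumulator createAccumulator() {
        return new Top2Accumulator();
    }

    // Folds one input value into the accumulator.
    static void accumulate(Top2Accumulator acc, int value) {
        if (acc.first == null || value > acc.first) {
            acc.second = acc.first;
            acc.first = value;
        } else if (acc.second == null || value > acc.second) {
            acc.second = value;
        }
    }

    // Emits up to two rows, each with two columns: {value, rank}.
    // A regular aggregate would return a single value here instead.
    static List<int[]> emitValue(Top2Accumulator acc) {
        List<int[]> rows = new ArrayList<>();
        if (acc.first != null) {
            rows.add(new int[] {acc.first, 1});
        }
        if (acc.second != null) {
            rows.add(new int[] {acc.second, 2});
        }
        return rows;
    }

    public static void main(String[] args) {
        Top2Accumulator acc = createAccumulator();
        for (int v : new int[] {3, 9, 5}) {
            accumulate(acc, v);
        }
        for (int[] row : emitValue(acc)) {
            System.out.println(row[0] + ", " + row[1]);
        }
    }
}
```

   For the inputs 3, 9 and 5, the sketch emits the rows (9, 1) and (5, 2): 
two rows of two columns from a single group.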
   
   - Support TableAggregate in some MetadataHandlers. 
[e3d5e780a1](https://github.com/apache/flink/commit/e3d5e780a11ddd27ab6721772925db730d95c75c)
   A MetadataHandler provides metadata about a relational expression, for 
example the unique key information of a RelNode. This information can be used 
during optimization.
   This commit adds TableAggregate support in FlinkRelMdColumnInterval, 
FlinkRelMdFilteredColumnInterval and FlinkRelMdModifiedMonotonicity. 
TableAggregate is absent from the other MetadataHandlers mainly because:
   
       - TableAggregate cannot provide the uniqueness information that is 
required in FlinkRelMdColumnUniqueness, FlinkRelMdDistinctRowCount, 
FlinkRelMdUniqueGroups and FlinkRelMdUniqueKeys.
       - Most of the other MetadataHandlers are not used in streaming jobs, 
such as FlinkRelMdPercentageOriginalRows, FlinkRelMdPopulationSize, 
FlinkRelMdRowCount, etc.
   
   ## Verifying this change
     - Adds table aggregate tests in the blink planner.
     - Adds metadata handler tests in the blink planner.
     - Adds harness tests in the blink planner.
   
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): (no)
     - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
     - The serializers: (no)
     - The runtime per-record code paths (performance sensitive): (no)
     - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
     - The S3 file system connector: (no)
   
   ## Documentation
   
     - Does this pull request introduce a new feature? (no)

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
