BenoitHanotte opened a new pull request #10854: [FLINK-15577][table-planner] 
Fix similar aggregations with different windows being considered the same
URL: https://github.com/apache/flink/pull/10854
 
 
   ## What is the purpose of the change
   
   The RelNode's digest is used by the Calcite HepPlanner to avoid adding 
duplicate vertices to the graph. If an equivalent vertex was already present in 
the graph, then that vertex is used in place of the newly generated one. This 
means that the digest needs to contain all the information
   necessary to identifying a vertex and distinguishing it from similar (but 
not equivalent) vertices.
   
   In the case of the `WindowAggregation` nodes, the window specs are currently 
not in the digest, meaning that **two aggregations with the same signatures and 
expressions but different windows are considered equivalent by the planner, 
which is not correct and will lead to an invalid Physical Plan**.
   
   This commit fixes this issue and adds a test ensuring that the window specs 
are in the digest, as well as similar aggregations on two different windows 
will not be considered equivalent.
   
   Only the old planner is subject to the issue, the Blink planner correctly 
uses the window specs in the nodes' digests, allowing Blink to correctly 
differentiate between the nodes.
   
   More info and an example of an invalid plan are avalaible at 
https://issues.apache.org/jira/browse/FLINK-15577 
   
   ## Brief change log
   
   Added window specs to the following RelNodes:
   - LogicalWindowAggregate
   - FlinkLogicalWindowAggregate
   - LogicalWindowTableAggregate
   - FlinkLogicalWindowTableAggregate
   
   Added unit tests to the legacy planner and to the blink planner to prevent 
future regressions.
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   - Added unit tests for the old planner to ensure window specs are in digest 
and that similar aggregations with different windows are not considered 
equivalent in the physical plan.
   - Added unit tests for the blink planner to ensure no such regression can be 
introduced in the future.
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): no
     - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
     - The serializers: no
     - The runtime per-record code paths (performance sensitive): no
     - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
     - The S3 file system connector: no
   
   ## Documentation
   
     - Does this pull request introduce a new feature? no
     - If yes, how is the feature documented? not applicable

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

Reply via email to