wiedld opened a new pull request, #13986:
URL: https://github.com/apache/datafusion/pull/13986

   ## Which issue does this PR close?
   
   Part of https://github.com/apache/datafusion/issues/13652
   
   ## Rationale for this change
   
   The [original discussion](https://github.com/apache/datafusion/issues/13525) 
mentioned implicit changes that can cause problems when upgrading 
DataFusion (DF). These implicit changes are often the result of how DF core 
components interact with user-defined extensions that add and mutate 
different plan nodes.
   
   We previously [introduced the 
concept](https://github.com/apache/datafusion/issues/13652) of invariants as a 
way to more quickly isolate cases where an implicit change conflicts with 
user-defined plan extensions. A [previous 
PR](https://github.com/apache/datafusion/pull/13651) introduced the logical 
plan invariants. This PR introduces physical plan invariants.
   
   ## What changes are included in this PR?
   
   This WIP proposes the interface for the execution plan invariant checks. It 
was done a bit differently from the logical plan (LP) invariants. 
   
   The LP is a common enum with the same [invokable 
function](https://github.com/apache/datafusion/blob/38ccb0071045be1fae672ce2561c001f5d505efb/datafusion/expr/src/logical_plan/plan.rs#L1133-L1138)
 for checking invariants (although the [level of 
validation](https://github.com/apache/datafusion/blob/38ccb0071045be1fae672ce2561c001f5d505efb/datafusion/expr/src/logical_plan/invariants.rs#L31-L41)
 may vary). In contrast, each ExecutionPlan node is its own implementation. 
The invariant checks are therefore defined on each implementation, with a 
default set of invariants defined on the trait.
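   
   To illustrate the shape of this design, here is a minimal, dependency-free 
sketch. It does not use the real DataFusion types: `PlanNode`, `Leaf`, 
`UnionLike`, and `check_plan` are hypothetical stand-ins, and the signature of 
`check_node_invariants` shown here is an assumption rather than the one in 
this PR. The point is only that the trait supplies a default check and each 
node implementation can override it.

```rust
use std::fmt::Debug;
use std::sync::Arc;

/// Hypothetical, simplified stand-in for the `ExecutionPlan` trait.
trait PlanNode: Debug {
    fn name(&self) -> &str;
    fn children(&self) -> Vec<Arc<dyn PlanNode>>;

    /// Default invariants shared by every node. In this sketch there are
    /// none, so the default always succeeds. (Assumed signature.)
    fn check_node_invariants(&self) -> Result<(), String> {
        Ok(())
    }
}

/// A leaf node that relies on the trait's default invariants.
#[derive(Debug)]
struct Leaf;

impl PlanNode for Leaf {
    fn name(&self) -> &str {
        "Leaf"
    }
    fn children(&self) -> Vec<Arc<dyn PlanNode>> {
        vec![]
    }
}

/// A user-defined node that overrides the check with its own invariant:
/// it needs at least two inputs.
#[derive(Debug)]
struct UnionLike {
    inputs: Vec<Arc<dyn PlanNode>>,
}

impl PlanNode for UnionLike {
    fn name(&self) -> &str {
        "UnionLike"
    }
    fn children(&self) -> Vec<Arc<dyn PlanNode>> {
        self.inputs.clone()
    }
    fn check_node_invariants(&self) -> Result<(), String> {
        if self.inputs.len() < 2 {
            return Err(format!(
                "{} requires at least 2 inputs, found {}",
                self.name(),
                self.inputs.len()
            ));
        }
        Ok(())
    }
}

/// Walk the plan top-down and run each node's own invariant check.
fn check_plan(plan: &Arc<dyn PlanNode>) -> Result<(), String> {
    plan.check_node_invariants()?;
    for child in plan.children() {
        check_plan(&child)?;
    }
    Ok(())
}
```

   In this sketch, a `UnionLike` built with two `Leaf` inputs passes 
`check_plan`, while one built with a single input returns an error naming the 
offending node.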
   
   As with the LP invariants, the physical plan invariants are checked as part 
of the default planner. Also as with the LP, the more costly check runs only 
in debug mode.
   
   ## Are these changes tested?
   
   Yes
   
   ## Are there any user-facing changes?
   
   User-defined ExecutionPlan extensions can define their own set of invariants. 
When a DF upgrade is failing, users can run in debug mode and have their 
`ExecutionPlan::check_node_invariants` run after each optimizer pass. For 
example, this can isolate whether an upstream DF optimizer change has produced 
inputs that fail for the user's ExecutionPlan extensions.
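   
   The debugging workflow described above can be sketched as follows. This is 
a simplified model, not the planner code from this PR: `Plan`, `Rule`, 
`optimize`, `check_invariants`, and the two example rules are all hypothetical 
names, and the plan is reduced to a chain of node names (each entry's input is 
the next entry). A caller would typically pass `cfg!(debug_assertions)` as 
`check_each_pass`, so the per-pass check runs only in debug builds.

```rust
/// Hypothetical plan: a chain of node names, root first.
#[derive(Debug, Clone, PartialEq)]
struct Plan {
    nodes: Vec<String>,
}

/// Hypothetical optimizer rule: a name plus a plan-rewriting function.
struct Rule {
    name: &'static str,
    apply: fn(Plan) -> Plan,
}

/// Assumed invariant of a user-defined node: `MyExtension` must read
/// directly from a `Sort` node.
fn check_invariants(plan: &Plan) -> Result<(), String> {
    for (i, node) in plan.nodes.iter().enumerate() {
        if node.as_str() == "MyExtension" {
            let input = plan.nodes.get(i + 1).map(String::as_str);
            if input != Some("Sort") {
                return Err(format!(
                    "MyExtension requires a Sort input, found {:?}",
                    input
                ));
            }
        }
    }
    Ok(())
}

/// Example rule that leaves the plan unchanged.
fn noop(plan: Plan) -> Plan {
    plan
}

/// Example rule that removes every `Sort`, which breaks the invariant above.
fn drop_sorts(mut plan: Plan) -> Plan {
    plan.nodes.retain(|n| n.as_str() != "Sort");
    plan
}

/// Apply each rule in order. When `check_each_pass` is true, invariants are
/// checked after every rule so a failure names the rule that caused it.
/// A final check always runs once at the end.
fn optimize(plan: Plan, rules: &[Rule], check_each_pass: bool) -> Result<Plan, String> {
    let mut plan = plan;
    for rule in rules {
        plan = (rule.apply)(plan);
        if check_each_pass {
            check_invariants(&plan)
                .map_err(|e| format!("after rule `{}`: {}", rule.name, e))?;
        }
    }
    check_invariants(&plan)?;
    Ok(plan)
}
```

   With per-pass checking enabled, the error message identifies `drop_sorts` 
as the rule that produced the invalid input. With it disabled, only the final 
check fails and the responsible rule is not named.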


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

