wiedld opened a new pull request, #13651:
URL: https://github.com/apache/datafusion/pull/13651

   ## Which issue does this PR close?
   
   For discussion re: #13525 
   
   ## Rationale for this change
   
   Many implicit changes (as opposed to explicit API changes) can have 
unintended consequences when upgrading DataFusion (DF). This WIP is an example 
of handling one class of them: implicit logical plan (LP) changes between 
analyzer and optimizer passes, which can cause side effects downstream that 
are harder to debug. 
   
   It attempts to handle this issue by:
   * defining the invariants
   * checking the invariants for extensible interfaces (which may be user defined)
   * throwing the error closer to the problem (instead of weird behavior later)
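   
   As a rough illustration of the three steps above, here is a minimal, 
self-contained sketch. It is not DataFusion's API: `Plan`, `Rule`, 
`check_invariant`, `run_rules` and `DropLastColumn` are hypothetical names, 
and the invariant itself (a rule must not change the output columns) is an 
assumption chosen only to keep the example small.

```rust
/// Hypothetical stand-in for a logical plan: only its output column names.
#[derive(Debug, Clone, PartialEq)]
pub struct Plan {
    pub columns: Vec<String>,
}

/// Hypothetical stand-in for a user-defined analyzer or optimizer rule.
pub trait Rule {
    fn name(&self) -> &str;
    fn rewrite(&self, plan: Plan) -> Result<Plan, String>;
}

/// Assumed invariant for this sketch: a rule must not change the output columns.
fn check_invariant(before: &[String], after: &Plan) -> Result<(), String> {
    if before == after.columns.as_slice() {
        Ok(())
    } else {
        Err(format!(
            "output columns changed from {:?} to {:?}",
            before, after.columns
        ))
    }
}

/// Apply the rules in order and check the invariant after each one, so a
/// violation is reported against the rule that introduced it.
pub fn run_rules(mut plan: Plan, rules: &[Box<dyn Rule>]) -> Result<Plan, String> {
    for rule in rules {
        let before = plan.columns.clone();
        plan = rule.rewrite(plan)?;
        check_invariant(&before, &plan)
            .map_err(|e| format!("rule '{}' violated an invariant: {}", rule.name(), e))?;
    }
    Ok(plan)
}

/// A deliberately buggy rule that silently drops the last output column.
pub struct DropLastColumn;

impl Rule for DropLastColumn {
    fn name(&self) -> &str {
        "drop_last_column"
    }

    fn rewrite(&self, mut plan: Plan) -> Result<Plan, String> {
        plan.columns.pop();
        Ok(plan)
    }
}
```

   Running `run_rules` with `DropLastColumn` returns an error that names the 
offending rule, rather than letting the changed plan surface as a confusing 
failure in a later pass.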
   
   The example here addresses only the invariants that apply to all analyzer 
and optimizer rules. Individual rules have their own invariants, which should 
be checked within those rules (although perhaps we need to make this more 
explicit in the rule API?).
   
   ## What changes are included in this PR?
   
   It provides a rough example of:
   * making the contract (of what can and cannot change) explicit in the 
analyzer vs optimizer passes
   * early failure after the user-defined modular component (e.g. 
[AnalyzerRule](https://docs.rs/datafusion/43.0.0/datafusion/optimizer/trait.AnalyzerRule.html)
 or 
[OptimizerRule](https://docs.rs/datafusion/43.0.0/datafusion/optimizer/trait.OptimizerRule.html))
   * two example checks:
       * for one of the invariants (mentioned [in the docs 
here](https://datafusion.apache.org/contributor-guide/specification/invariants.html#relation-name-tuples-in-logical-fields-and-logical-columns-are-unique))
       * for a union schema invariant which we at InfluxDB have hit a few 
times ourselves
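   
   To make the two example checks concrete, here is a hedged sketch using 
hypothetical types (`Field`, `check_unique_fields`, `check_union_inputs`), not 
the code in this PR. The first function follows the documented invariant that 
(relation, name) tuples are unique. The second is only a guess at what a union 
schema invariant could look like: it assumes every union input must have the 
same number of columns with matching types, position by position.

```rust
use std::collections::HashSet;

/// Hypothetical field: optional relation qualifier, column name, type name.
#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub relation: Option<String>,
    pub name: String,
    pub data_type: String,
}

/// Documented invariant: (relation, name) tuples are unique within a schema.
pub fn check_unique_fields(schema: &[Field]) -> Result<(), String> {
    let mut seen = HashSet::new();
    for field in schema {
        let key = (field.relation.as_deref(), field.name.as_str());
        if !seen.insert(key) {
            return Err(format!(
                "duplicate field ({:?}, {:?})",
                field.relation, field.name
            ));
        }
    }
    Ok(())
}

/// Assumed union invariant (an assumption of this sketch): all inputs have
/// the same column count and the same type at each position.
pub fn check_union_inputs(inputs: &[Vec<Field>]) -> Result<(), String> {
    let (first, rest) = match inputs.split_first() {
        Some(parts) => parts,
        None => return Ok(()), // no inputs, nothing to compare
    };
    for (i, input) in rest.iter().enumerate() {
        if input.len() != first.len() {
            return Err(format!(
                "union input {} has {} columns, expected {}",
                i + 1,
                input.len(),
                first.len()
            ));
        }
        for (pos, (a, b)) in first.iter().zip(input).enumerate() {
            if a.data_type != b.data_type {
                return Err(format!(
                    "union input {} column {} has type {}, expected {}",
                    i + 1,
                    pos,
                    b.data_type,
                    a.data_type
                ));
            }
        }
    }
    Ok(())
}
```

   Column names are deliberately not compared in `check_union_inputs`; whether 
they should be is a design decision this sketch leaves open.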
   
   ## Are these changes tested?
   
   N/A; this is a WIP.
   However, the current checks do cause some existing tests to fail.
   
   ## Are there any user-facing changes?
   
   No.
   However, it should surface errors sooner for user-defined analyzer and 
optimizer rules.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

