jorgecarleitao opened a new pull request #8024:
URL: https://github.com/apache/arrow/pull/8024


   This commit makes all type coercion happen in the physical plan instead of 
the logical plan, and fixes the supertype function. As a result, field names 
no longer change due to coercion rules, and we gain better control over how 
coercion supports physical calculations, among other benefits.
   
   This commit also makes it clearer how we enforce type checking during 
planning. The logical plan now knows how to derive its schema directly from 
binary expressions, even before the coercion is applied.
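
   As a rough illustration of deriving a field from a binary expression 
without rewriting it, the sketch below uses hypothetical, simplified 
`DataType` and `Expr` types (they are not DataFusion's actual definitions, and 
the promotion rule is only illustrative). The field name is computed from the 
expression as written, so it does not depend on any cast that coercion may add 
later.

```rust
// Hypothetical, simplified stand-ins; not the real Arrow/DataFusion types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Int32,
    Int64,
    Float64,
}

#[derive(Debug, Clone)]
enum Expr {
    Column { name: String, data_type: DataType },
    Plus(Box<Expr>, Box<Expr>),
}

// Illustrative rule for the result type of `+`: pick the wider numeric type.
fn result_type(lhs: DataType, rhs: DataType) -> DataType {
    use DataType::*;
    match (lhs, rhs) {
        (Float64, _) | (_, Float64) => Float64,
        (Int64, _) | (_, Int64) => Int64,
        _ => Int32,
    }
}

impl Expr {
    // The name depends only on the expression as the user wrote it.
    fn name(&self) -> String {
        match self {
            Expr::Column { name, .. } => name.clone(),
            Expr::Plus(l, r) => format!("{} + {}", l.name(), r.name()),
        }
    }

    // The type is derived from the operand types; no cast is inserted here.
    fn data_type(&self) -> DataType {
        match self {
            Expr::Column { data_type, .. } => *data_type,
            Expr::Plus(l, r) => result_type(l.data_type(), r.data_type()),
        }
    }

    // A (name, type) pair standing in for a schema field.
    fn to_field(&self) -> (String, DataType) {
        (self.name(), self.data_type())
    }
}
```

   With an `Int32` column `a` and an `Int64` column `b`, `a + b` yields a field 
named `a + b` of type `Int64` in this sketch, and the expression tree itself 
is left untouched.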
   
   The rationale for this change is that coercions are simplifications to a 
physical computation (it is easier to sum two numbers of the same type at the 
hardware level).
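
   A minimal sketch of this idea, assuming hypothetical `DataType` and 
`PhysicalExpr` types and an illustrative `numeric_supertype` rule (this is not 
the `get_supertype` implementation from this PR): when the physical expression 
is built, both operands are cast to a common type, and an unsupported 
combination is rejected at planning time.

```rust
// Hypothetical, simplified stand-ins; not the real Arrow/DataFusion types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Int32,
    Int64,
    Float64,
    Utf8,
}

#[derive(Debug, Clone, PartialEq)]
enum PhysicalExpr {
    Column { index: usize, data_type: DataType },
    Cast { expr: Box<PhysicalExpr>, to: DataType },
    Add(Box<PhysicalExpr>, Box<PhysicalExpr>),
}

// Illustrative supertype rule: strings cannot be added, numbers widen.
fn numeric_supertype(lhs: DataType, rhs: DataType) -> Option<DataType> {
    use DataType::*;
    match (lhs, rhs) {
        (Utf8, _) | (_, Utf8) => None,
        (Float64, _) | (_, Float64) => Some(Float64),
        (Int64, _) | (_, Int64) => Some(Int64),
        _ => Some(Int32), // only (Int32, Int32) remains
    }
}

fn type_of(expr: &PhysicalExpr) -> DataType {
    match expr {
        PhysicalExpr::Column { data_type, .. } => *data_type,
        PhysicalExpr::Cast { to, .. } => *to,
        // After coercion both operands share a type, so either side works.
        PhysicalExpr::Add(lhs, _) => type_of(lhs),
    }
}

// Wrap the expression in a cast only when its type differs from the target.
fn cast_if_needed(expr: PhysicalExpr, to: DataType) -> PhysicalExpr {
    if type_of(&expr) == to {
        expr
    } else {
        PhysicalExpr::Cast { expr: Box::new(expr), to }
    }
}

// Build a physical `Add`, coercing operands; errors surface during planning.
fn plan_add(lhs: PhysicalExpr, rhs: PhysicalExpr) -> Result<PhysicalExpr, String> {
    let (lt, rt) = (type_of(&lhs), type_of(&rhs));
    let common = numeric_supertype(lt, rt)
        .ok_or_else(|| format!("cannot add {:?} and {:?}", lt, rt))?;
    Ok(PhysicalExpr::Add(
        Box::new(cast_if_needed(lhs, common)),
        Box::new(cast_if_needed(rhs, common)),
    ))
}
```

   In this sketch, adding an `Int32` column to an `Int64` column casts only the 
`Int32` side, while adding a `Utf8` column to a number returns an `Err` before 
any data is processed.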
   
   This automatically solves ARROW-9809, an issue in which the physical schema 
could be modified by coercion rules, causing the RecordBatch's schema to be 
different from the logical batch.
   
   This also addresses some inconsistencies in how we coerced certain types for 
binary operators; such cases now error during planning instead of during 
execution.
   
   This also introduces a significant number of tests of the overall 
consistency of binary operators: it is now explicit what types they expect and 
how coercion happens for each operator. It also adds tests to different parts 
of the physical execution, to ensure schema consistency for binary operators, 
including negative tests (when it should error).
   
   This also makes `like` and `nlike` generally available, and adds some tests 
for them.
   
   This closes at least ARROW-9809 and ARROW-4957.
   
   @andygrove and @alamb, I am really sorry for this long commit, but I was 
unable to split this into smaller parts with passing tests. There was a strong 
coupling between `get_supertype` and the physical expressions that made it 
hard to work this through.



