alamb opened a new pull request #8460:
URL: https://github.com/apache/arrow/pull/8460


   This is a PR incorporating the feedback from @nevi-me  and @jorgecarleitao  
from https://github.com/apache/arrow/pull/8400
   
   It adds
   1. A `can_cast_types` function in the Arrow cast kernel (as suggested by 
@jorgecarleitao  / @nevi-me  in 
https://github.com/apache/arrow/pull/8400#discussion_r501850814) that encodes 
which type casts are valid
   2. A test that ensures `can_cast_types` and `cast` remain in sync
   3. Bug fixes that the test above uncovered (I'll comment inline)
   4. A change to DataFusion to use `can_cast_types` so that it plans casts 
consistently with what Arrow allows
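
   To illustrate items 1 and 2, here is a minimal, self-contained sketch of the 
idea: a predicate that says whether a cast is supported, a cast function with its 
own match statement, and a check that the two agree for every pair of types. 
`ToyType`, `ToyValue`, `sample`, and `out_of_sync_pairs` are hypothetical names 
invented for this example, and the set of supported casts is made up. This is not 
the Arrow implementation, only the shape of the approach.

```rust
// Toy stand-ins for illustration only; these are NOT Arrow's DataType or arrays.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ToyType {
    Int32,
    Float64,
    Utf8,
    Boolean,
}

#[derive(Debug, Clone, PartialEq)]
enum ToyValue {
    Int32(i32),
    Float64(f64),
    Utf8(String),
    Boolean(bool),
}

impl ToyValue {
    fn data_type(&self) -> ToyType {
        match self {
            ToyValue::Int32(_) => ToyType::Int32,
            ToyValue::Float64(_) => ToyType::Float64,
            ToyValue::Utf8(_) => ToyType::Utf8,
            ToyValue::Boolean(_) => ToyType::Boolean,
        }
    }
}

const ALL_TYPES: [ToyType; 4] = [
    ToyType::Int32,
    ToyType::Float64,
    ToyType::Utf8,
    ToyType::Boolean,
];

/// Returns true if `cast` is expected to support `from -> to` (assumed rules).
fn can_cast_types(from: ToyType, to: ToyType) -> bool {
    use ToyType::*;
    match (from, to) {
        (a, b) if a == b => true,
        (Int32, Float64) | (Float64, Int32) => true,
        (Int32, Utf8) | (Float64, Utf8) | (Boolean, Utf8) => true,
        (Boolean, Int32) => true,
        _ => false,
    }
}

/// Performs the cast; the match arms deliberately mirror `can_cast_types`.
fn cast(value: &ToyValue, to: ToyType) -> Result<ToyValue, String> {
    use ToyType as T;
    use ToyValue as V;
    match (value, to) {
        (v, t) if v.data_type() == t => Ok(v.clone()),
        (V::Int32(v), T::Float64) => Ok(V::Float64(f64::from(*v))),
        (V::Float64(v), T::Int32) => Ok(V::Int32(*v as i32)),
        (V::Int32(v), T::Utf8) => Ok(V::Utf8(v.to_string())),
        (V::Float64(v), T::Utf8) => Ok(V::Utf8(v.to_string())),
        (V::Boolean(v), T::Utf8) => Ok(V::Utf8(v.to_string())),
        (V::Boolean(v), T::Int32) => Ok(V::Int32(i32::from(*v))),
        (v, t) => Err(format!("unsupported cast {:?} -> {:?}", v.data_type(), t)),
    }
}

/// Builds one representative value of the given type.
fn sample(t: ToyType) -> ToyValue {
    match t {
        ToyType::Int32 => ToyValue::Int32(1),
        ToyType::Float64 => ToyValue::Float64(1.5),
        ToyType::Utf8 => ToyValue::Utf8("a".to_string()),
        ToyType::Boolean => ToyValue::Boolean(true),
    }
}

/// Lists every (from, to) pair where the predicate and the cast disagree.
fn out_of_sync_pairs() -> Vec<(ToyType, ToyType)> {
    let mut mismatches = Vec::new();
    for &from in ALL_TYPES.iter() {
        for &to in ALL_TYPES.iter() {
            let cast_ok = cast(&sample(from), to).is_ok();
            if cast_ok != can_cast_types(from, to) {
                mismatches.push((from, to));
            }
        }
    }
    mismatches
}
```

   A test can then assert that `out_of_sync_pairs()` is empty. If someone adds an 
arm to one match statement but not the other, the offending pair shows up in the 
returned list.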
   
   Previously the notions of coercion and casting were somewhat conflated in 
DataFusion. I have tried to clarify them in 
https://github.com/apache/arrow/pull/8399 and this PR. See also 
https://github.com/apache/arrow/pull/8340#discussion_r501257096 for more 
discussion.
   
   I am adding this functionality so DataFusion gains rudimentary support for 
`DictionaryArray`.
   
   Codewise, I am concerned about the duplication in logic between the match 
statements in `cast` and `can_cast_types`. I have some thoughts on how to unify 
them (see https://github.com/apache/arrow/pull/8400#discussion_r504278902), but 
I don't have time to implement that as it is a bigger change. I think this 
approach with some duplication is ok, and the test will ensure they remain in 
sync. 
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

