findepi commented on issue #12644:
URL: https://github.com/apache/datafusion/issues/12644#issuecomment-3158601704

   Metadata-based types, such as Arrow extension types, are definitely a viable and 
incrementally executable path.
   However, the end result is far from optimal: there is the official type 
system, represented by `DataType`, and there is the _actual_ type system, represented 
by `DataType + properties bag`. Eventually, every use of `DataType` will need 
to be revisited and updated.
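
A minimal sketch of that split, to make the "every use will need to be revisited" point concrete. The `DataType` and `Field` below are simplified stand-ins rather than the arrow-rs definitions, and both functions are hypothetical rules invented for illustration. The only detail taken from Arrow itself is that an extension type is recorded in field metadata under the `ARROW:extension:name` key, with `arrow.json` as the canonical JSON extension name.

```rust
use std::collections::HashMap;

// Simplified stand-ins for illustration only; not the arrow-rs types.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Utf8,
    Binary,
    FixedSizeBinary(usize),
}

struct Field {
    data_type: DataType,
    // The "properties bag": the extension type lives here, not in `data_type`.
    metadata: HashMap<String, String>,
}

// Hypothetical rule written against `DataType` alone.
fn bytewise_equality_ok_naive(data_type: &DataType) -> bool {
    matches!(
        data_type,
        DataType::Utf8 | DataType::Binary | DataType::FixedSizeBinary(_)
    )
}

// The same rule after being revisited: it has to consult the metadata too.
fn bytewise_equality_ok(field: &Field) -> bool {
    match field.metadata.get("ARROW:extension:name") {
        // Assumed policy: any extension type may define its own equality,
        // so refuse byte-wise comparison rather than guess.
        Some(_) => false,
        None => bytewise_equality_ok_naive(&field.data_type),
    }
}

fn main() {
    let mut metadata = HashMap::new();
    metadata.insert("ARROW:extension:name".to_string(), "arrow.json".to_string());
    let json_field = Field { data_type: DataType::Utf8, metadata };

    // Seen through `DataType` alone, the JSON column looks like plain text.
    println!("naive: {}", bytewise_equality_ok_naive(&json_field.data_type));
    println!("metadata-aware: {}", bytewise_equality_ok(&json_field));
}
```

Any code path that still calls the first function keeps compiling and keeps treating the JSON column as plain `Utf8`, which is the hazard described above.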
   
   > For example could one make a UUID type stored as a fixed length binary 
that works with col = 'abc' as well as col = 'a-b-c'?
   
   This is a good example, and it's easy to come up with many others like it.
   JSON backed by either Utf8 or Binary: what's the equality function? 
Byte-for-byte, or structural?
   VARIANT backed by Binary, containing serialized data: the equality _must_ 
deserialize.
   UUID backed by  ...
   In none of these cases does the backing type (Utf8 or Binary) carry any meaning, 
and it cannot be used for anything. Yet it's a valid `DataType`, so it _can be 
(erroneously) used for something_, be it an optimizer rule, a function call, 
coercion logic, etc.
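
To illustrate with the UUID case from the quoted question, here is a rough sketch assuming the value is stored as 16 raw bytes. All three functions are hypothetical helpers written for this example, not DataFusion or Arrow APIs, and the parser is deliberately lenient (it ignores hyphens wherever they appear).

```rust
// Hypothetical helper: parse UUID text, with or without hyphens,
// into the 16 bytes that would be stored in the column.
fn parse_uuid(text: &str) -> Option<[u8; 16]> {
    let hex: Vec<u8> = text.bytes().filter(|b| *b != b'-').collect();
    if hex.len() != 32 {
        return None;
    }
    let mut out = [0u8; 16];
    for (i, pair) in hex.chunks(2).enumerate() {
        let hi = (pair[0] as char).to_digit(16)?;
        let lo = (pair[1] as char).to_digit(16)?;
        out[i] = (hi * 16 + lo) as u8;
    }
    Some(out)
}

// Equality as a logical UUID type would define it:
// parse the literal first, then compare the 16 bytes.
fn uuid_eq_literal(stored: &[u8; 16], literal: &str) -> bool {
    parse_uuid(literal).map_or(false, |parsed| &parsed == stored)
}

// Equality as the backing type sees it: raw bytes against raw bytes.
fn backing_eq_literal(stored: &[u8; 16], literal: &str) -> bool {
    stored.as_slice() == literal.as_bytes()
}

fn main() {
    let stored = parse_uuid("123e4567-e89b-12d3-a456-426614174000").unwrap();
    for literal in [
        "123e4567-e89b-12d3-a456-426614174000",
        "123e4567e89b12d3a456426614174000",
    ] {
        println!(
            "{literal}: logical={} backing={}",
            uuid_eq_literal(&stored, literal),
            backing_eq_literal(&stored, literal)
        );
    }
}
```

Both spellings match under the logical comparison, while the comparison on the backing bytes matches neither, because a 32- or 36-byte text literal can never equal 16 stored bytes.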
   
   Is this the end state we want to achieve?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


