Rafferty97 commented on issue #20585:
URL: https://github.com/apache/datafusion/issues/20585#issuecomment-3987760466

   I have a concern that I think is worth discussing before we commit to this 
approach. Specifically, I think UDFs should be free to return whatever 
physical type is cheapest for them to produce; restricting them to 
mirroring their input type(s) would unnecessarily cap query performance. For 
example:
   - Functions like `trim` and `substr` only produce views into the underlying 
data, so returning `Utf8View` is a clear performance win even when the input 
types aren't string views
   - Conversely, functions like `reverse` always allocate new data buffers, so 
it is more space-efficient to return `Utf8` or `LargeUtf8` rather than 
`Utf8View`, as their offset buffers take up 4 or 8 bytes per element, 
respectively, as opposed to `Utf8View`'s 16 bytes.
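
To put rough numbers on the two cases above, here is a minimal back-of-the-envelope sketch in plain Rust. It does not use the arrow or DataFusion APIs. The names `StringLayout`, `fresh_output_bytes` and `view_only_output_bytes` are hypothetical and exist only for illustration. The model assumes the standard Arrow layouts (`i32` offsets for `Utf8`, `i64` offsets for `LargeUtf8`, 16-byte views for `Utf8View`, with strings of up to 12 bytes stored inline in the view) and ignores validity bitmaps, padding and buffer over-allocation.

```rust
/// The three Arrow string layouts under discussion (illustrative enum, not an arrow type).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum StringLayout {
    Utf8,
    LargeUtf8,
    Utf8View,
}

/// Strings of at most this many bytes are stored inline in a 16-byte view.
const INLINE_LIMIT: usize = 12;

/// Per-element bookkeeping cost in bytes: one offset or one view.
pub fn bytes_per_element(layout: StringLayout) -> usize {
    match layout {
        StringLayout::Utf8 => std::mem::size_of::<i32>(),
        StringLayout::LargeUtf8 => std::mem::size_of::<i64>(),
        StringLayout::Utf8View => std::mem::size_of::<u128>(),
    }
}

/// Approximate bytes needed when a function (e.g. `reverse`) must write
/// brand-new string data for every row. `lens` holds the output string lengths.
pub fn fresh_output_bytes(layout: StringLayout, lens: &[usize]) -> usize {
    let rows = lens.len();
    match layout {
        // Offset buffers hold rows + 1 entries, followed by all string bytes.
        StringLayout::Utf8 | StringLayout::LargeUtf8 => {
            (rows + 1) * bytes_per_element(layout) + lens.iter().sum::<usize>()
        }
        // One view per row; only strings longer than the inline limit
        // need space in a separate data buffer.
        StringLayout::Utf8View => {
            let out_of_line: usize = lens.iter().filter(|&&l| l > INLINE_LIMIT).sum();
            rows * bytes_per_element(layout) + out_of_line
        }
    }
}

/// Approximate bytes needed when a function (e.g. `trim`, `substr`) can emit
/// views that point into the input's existing data buffer: no string bytes
/// are copied, only the views themselves are written.
/// Assumption: the input data buffer can be shared with the output array.
pub fn view_only_output_bytes(rows: usize) -> usize {
    rows * bytes_per_element(StringLayout::Utf8View)
}
```

Under this model, three freshly written 100-byte strings cost about 316 bytes as `Utf8`, 332 as `LargeUtf8` and 348 as `Utf8View`, while a view-only result over the same three rows costs 48 bytes because no string data is copied. The gap in the freshly written case shrinks, and can reverse, when most strings are short enough to be inlined.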
   
   I understand there's an argument to be made about consistency, but the 
logical/physical planners appear to already be capable of inserting casts where 
needed to ensure these types can mix well.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

