Omega359 commented on PR #20604:
URL: https://github.com/apache/datafusion/pull/20604#issuecomment-3980125922

   > Maybe I'm missing something, but what is the actual advantage of returning
   > a `Utf8View` from a UDF like reverse?
   >
   > For functions like `trim` or `substr` it makes sense to return a
   > `Utf8View` regardless of the input type, as it allows for reuse of the
   > underlying string buffer, given that each output value is just a slice of
   > existing string data. But for a function like `reverse` that creates brand new
   > string values, surely returning `Utf8View` just adds extra overhead with no
   > benefit?
   
   In a word: consistency.
   
   What you outline is true if you look at the function in isolation. In my
use case, where I use DataFusion in an ETL environment, it's rarely just one
function being applied to a column; it's usually a chain of them.
   
   I've been attempting to enable Utf8View in my app for the better part of
the last month, and I keep encountering issues where DataFusion calls emit
Utf8View in one case and Utf8 in another. It works fine until you try to
merge the schemas, and then... boom. Finding every instance where something
consumes Utf8View and emits Utf8, and adding a cast after it, is tiresome and
expensive, both in my time and in actual test runs.
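
   To make the failure mode concrete, here is a minimal sketch in plain Rust. It deliberately does not use the real arrow or DataFusion types: `StrType`, `Schema`, `merge`, and `cast_all` are hypothetical stand-ins I made up for illustration, and the strict merge rule is an assumption that only approximates how a schema merge rejects mismatched column types.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-in for the two string types discussed above.
// This is NOT the real arrow DataType enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StrType {
    Utf8,
    Utf8View,
}

// A toy schema: column name -> string type.
type Schema = BTreeMap<String, StrType>;

// Strict merge (assumed rule): a column present in both schemas
// must have exactly the same type, otherwise the merge fails.
fn merge(a: &Schema, b: &Schema) -> Result<Schema, String> {
    let mut out = a.clone();
    for (name, ty) in b {
        if let Some(existing) = a.get(name) {
            if existing != ty {
                return Err(format!(
                    "column '{}': {:?} vs {:?}",
                    name, existing, ty
                ));
            }
        }
        out.insert(name.clone(), *ty);
    }
    Ok(out)
}

// The manual workaround: force every column to one type before merging.
fn cast_all(schema: &Schema, to: StrType) -> Schema {
    schema.keys().map(|k| (k.clone(), to)).collect()
}
```

   In this toy model, merging a schema whose `name` column is `Utf8View` with one where a function in the chain emitted `Utf8` returns an error, and it only succeeds once `cast_all` is applied to the odd one out. That extra cast is the step I would rather not have to hunt down after every function.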
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

