Re: [PR] Update reverse UDF to emit utf8view when input is utf8view [datafusion]

via GitHub Wed, 04 Mar 2026 07:21:47 -0800


Omega359 commented on PR #20604:
URL: https://github.com/apache/datafusion/pull/20604#issuecomment-3998230418


   > So, given that both `Utf8` and `Utf8View` materialise into the same 
physical representation in the parquet files, would a simple solution for your 
use case be to configure datafusion (or whatever system is reading back these 
parquet files) to always read in these fields as the same arrow type?
   
   I think datafusion *should* already do that, since 
`schema_force_view_types` defaults to true, but apparently it doesn't in 
whatever code path I'm triggering. My guess is that because I'm inferring the 
table schema from s3 data, and that code just grabs the schema from the first 
file (generated by duckdb in this case), it somehow assigns utf8 to the 
column(s). Just a guess though. I'm doing one more test today, and if that 
doesn't work I'm switching back to utf8 everywhere and will come back to this 
in probably a few months.
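
To make the guess above concrete, here is a toy model in Python. It is not DataFusion code: `force_view_types`, `infer_table_schema`, the `apply_force_view` flag, and the sample schemas are all hypothetical names made up for illustration. It only shows how a table could end up with utf8 if the schema is taken from the first file and the view-type mapping is skipped on that path.

```python
# Toy model of the guess above; none of these names are DataFusion APIs.
VIEW_TYPES = {"Utf8": "Utf8View", "Binary": "BinaryView"}


def force_view_types(schema):
    """Map string/binary columns to their view counterparts."""
    return {name: VIEW_TYPES.get(dtype, dtype) for name, dtype in schema.items()}


def infer_table_schema(file_schemas, apply_force_view=True):
    """Take the table schema from the first file only, as hypothesized."""
    first = file_schemas[0]
    return force_view_types(first) if apply_force_view else dict(first)


# Hypothetical input: the first file's string column is read as plain Utf8.
files = [
    {"name": "Utf8", "id": "Int64"},
    {"name": "Utf8View", "id": "Int64"},
]

expected = infer_table_schema(files)                           # mapping applied
suspected = infer_table_schema(files, apply_force_view=False)  # mapping skipped

print(expected["name"], suspected["name"])
```

If the real inference path behaves like the second call, the column type would depend on whichever file happens to be listed first, which would match what I'm seeing.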


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


