Omega359 commented on PR #20604: URL: https://github.com/apache/datafusion/pull/20604#issuecomment-3998230418
> So, given that both `Utf8` and `Utf8View` materialise into the same physical representation in the parquet files, would a simple solution for your use case be to configure datafusion (or whatever system is reading back these parquet files) to always read in these fields as the same arrow type?

I think datafusion *should*, because `schema_force_view_types` defaults to true, but apparently it doesn't in whatever code path I'm triggering. My guess is that because I'm inferring the schema for a table based on s3 data, and that code just grabs the schema from the first file (generated by duckdb in this case), it is somehow assigning utf8 to the column(s). Just a guess though.

I'm doing one more test today, and if that doesn't work I'm switching back to utf8 everywhere and will come back to this in probably a few months.
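To make the guess above concrete, here is a toy model in plain Python. It is only a sketch: strings stand in for Arrow types, and the names `infer_table_schema` and `force_view_types` are hypothetical helpers, not DataFusion APIs. It assumes the first file reports `Utf8` and that inference copies that schema without normalizing it.

```python
# Toy model only: plain strings stand in for Arrow types; this is not DataFusion code.

def infer_table_schema(file_schemas):
    """Hypothetical inference that copies the first file's schema verbatim."""
    return dict(file_schemas[0])


def force_view_types(schema):
    """Map the string type to its view counterpart, leaving other types alone."""
    return {
        name: "Utf8View" if dtype == "Utf8" else dtype
        for name, dtype in schema.items()
    }


# Assumed per-file schemas: the first file is the one written by another tool.
first_file = {"id": "Int64", "name": "Utf8"}
later_file = {"id": "Int64", "name": "Utf8View"}
files = [first_file, later_file]

# Without normalization, the table inherits Utf8 from the first file.
inferred = infer_table_schema(files)

# With normalization applied after inference, both files agree on Utf8View.
normalized = force_view_types(inferred)

# Columns whose inferred type disagrees with the later file.
mismatched = [col for col in inferred if inferred[col] != later_file[col]]
```

If the real code path behaves like `infer_table_schema` here, applying a step like `force_view_types` to the inferred schema would be one way to get a consistent type, but that is an assumption about where the mismatch comes from.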
