alamb opened a new issue, #12123: URL: https://github.com/apache/datafusion/issues/12123
### Describe the bug

One of the last remaining issues causing test failures when we enable reading StringView by default in https://github.com/apache/datafusion/pull/12092 is as follows:

```
failures:
    datasource::file_format::parquet::tests::fetch_metadata_with_size_hint
    datasource::file_format::parquet::tests::read_alltypes_plain_parquet
    datasource::file_format::parquet::tests::read_binary_alltypes_plain_parquet
    datasource::file_format::parquet::tests::read_merged_batches
    datasource::file_format::parquet::tests::test_statistics_from_parquet_metadata
```

### To Reproduce

Start from https://github.com/apache/datafusion/pull/12092 and then run:

```shell
cargo test -p datafusion --lib -- file_format::parquet
```

### Expected behavior

The tests should pass.

### Additional context

The problem is that the table schema is configured to be Utf8View but the file schema is using Utf8 (so the stats are returned as Utf8), and the accumulators can't handle updating a Utf8View value from a Utf8 value.

@XiangpengHao solved this issue in https://github.com/apache/datafusion/pull/11862#discussion_r1727710645 by threading the parameter through and then casting the file schema appropriately.

The code isn't great to start with, and adding a new parameter makes it worse. I also think there are some bugs lurking there that we could find and fix if the code were more testable.
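
To make the mismatch described under "Additional context" concrete, here is a minimal, std-only sketch of the idea of casting the file schema to match the table schema before statistics are accumulated. It is not DataFusion or arrow code: `ColType`, `Field`, `field`, `coerce_file_schema`, and `can_accumulate` are all hypothetical names made up for illustration, and the real fix in the linked PR may look quite different.

```rust
/// Simplified stand-in for an Arrow data type (hypothetical, not arrow's `DataType`).
#[derive(Debug, Clone, PartialEq)]
enum ColType {
    Utf8,
    Utf8View,
    Int32,
}

/// Simplified stand-in for a schema field (hypothetical).
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    ty: ColType,
}

/// Convenience constructor for a `Field`.
fn field(name: &str, ty: ColType) -> Field {
    Field { name: name.to_string(), ty }
}

/// Returns a copy of `file_schema` in which every Utf8 column that the table
/// schema declares as Utf8View is rewritten to Utf8View. All other columns
/// are left untouched.
fn coerce_file_schema(table_schema: &[Field], file_schema: &[Field]) -> Vec<Field> {
    file_schema
        .iter()
        .map(|f| {
            let wants_view = table_schema
                .iter()
                .any(|t| t.name == f.name && t.ty == ColType::Utf8View);
            if wants_view && f.ty == ColType::Utf8 {
                field(&f.name, ColType::Utf8View)
            } else {
                f.clone()
            }
        })
        .collect()
}

/// Toy model of the accumulator constraint: a statistic can only be merged
/// when its type matches the table column's type exactly.
fn can_accumulate(table_col: &Field, stat_col: &Field) -> bool {
    table_col.ty == stat_col.ty
}
```

With a table schema of `[s: Utf8View, i: Int32]` and a file schema of `[s: Utf8, i: Int32]`, `can_accumulate` rejects the `s` column until the file schema has been passed through `coerce_file_schema`. Keeping the coercion in one small pure function like this is also the kind of shape that would make the logic easier to unit test.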
