andygrove opened a new issue, #3311: URL: https://github.com/apache/datafusion-comet/issues/3311
## Summary

8 Spark SQL tests fail because they expect a `SparkException` when reading Parquet with mismatched schemas, but `native_datafusion` handles type widening/mismatches gracefully instead of throwing.

## Failing Tests

- `ParquetSchemaSuite`: "schema mismatch failure error message for parquet vectorized reader"
- `ParquetSchemaSuite`: "SPARK-45604: schema mismatch failure error on timestamp_ntz to array<timestamp_ntz>"
- `ParquetQuerySuite`: "SPARK-36182: can't read TimestampLTZ as TimestampNTZ"
- `ParquetQuerySuite`: "SPARK-34212 Parquet should read decimals correctly"
- `ParquetIOSuite`: "SPARK-35640: read binary as timestamp should throw schema incompatible error"
- `ParquetIOSuite`: "SPARK-35640: int as long should throw schema incompatible error"
- `FileBasedDataSourceSuite`: "Spark native readers should respect spark.sql.caseSensitive"
- `ParquetReadSuite`: "row group skipping doesn't overflow when reading into larger type"

## Root Cause

These tests assert that reading with a mismatched schema throws an exception. Native DataFusion silently handles these cases (type widening, schema evolution). This is a test-assumptions issue: the tests need guards to skip when `native_datafusion` is active, or their expectations need updating.

## Related

Discovered in CI for #3307 (enable native_datafusion in auto scan mode).
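A minimal sketch of what such a guard could look like in one of the affected suites. This is illustrative only: the config key `spark.comet.scan.impl` and the skip message are assumptions for the sketch, not a verified Comet API, and the actual guard should use whatever helper the Comet test infrastructure already provides for detecting the active scan implementation.

```scala
// Hypothetical test guard; the config key below is an assumption for
// illustration, not necessarily Comet's actual configuration name.
test("SPARK-35640: int as long should throw schema incompatible error") {
  // Skip the exception-expecting assertion when native_datafusion is active,
  // since that scan implementation widens int to long instead of throwing.
  assume(
    spark.conf.getOption("spark.comet.scan.impl") != Some("native_datafusion"),
    "native_datafusion handles schema mismatches via type widening; no SparkException expected")

  // ... original test body asserting that a SparkException is thrown ...
}
```

Alternatively, the expectations could branch: assert the widened result under `native_datafusion` and the exception otherwise, which keeps coverage instead of skipping.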
