andygrove opened a new pull request, #3259: URL: https://github.com/apache/datafusion-comet/pull/3259
## Summary

- Document that `native_datafusion` and `native_iceberg_compat` do not support datetime rebasing detection
- Document that these implementations do not support Spark's Datasource V2 API

## Background

While investigating `ParquetDatetimeRebaseSuite` tests that explicitly set `native_comet`, we discovered these are intentional limitations of the DataFusion-based scan implementations, not test issues.

### Datetime Rebasing

Parquet files written before Spark 3.0 may contain dates/timestamps using the hybrid Julian/Gregorian calendar. The `native_comet` implementation:

- Detects legacy datetime metadata in Parquet files
- Can throw `SparkException` when `spark.comet.exceptionOnDatetimeRebase=true`
- Or reads values without rebasing (CORRECTED mode)

The DataFusion-based implementations (`native_datafusion`, `native_iceberg_compat`) do not have this detection capability and read all dates/timestamps as Proleptic Gregorian, which may produce incorrect results for dates before October 15, 1582.

### Datasource V2 API

The DataFusion-based implementations only support Spark's V1 datasource API. When `spark.sql.sources.useV1SourceList` does not include `parquet`, Comet falls back to `native_comet`.

## Test plan

- [x] Documentation only change

🤖 Generated with [Claude Code](https://claude.com/claude-code)
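
To make the datetime rebasing limitation concrete, here is a minimal sketch in plain Java (no Spark or Comet dependency). It shows how a day count written under the hybrid Julian/Gregorian calendar turns into a different date when read back as Proleptic Gregorian. The class `RebaseDemo` and its helper methods are hypothetical names for illustration only; this is not Comet's or Spark's actual rebasing code.

```java
import java.time.LocalDate;
import java.util.GregorianCalendar;
import java.util.TimeZone;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration class; not part of Comet or Spark.
class RebaseDemo {

    // Days since 1970-01-01 for a date label interpreted in the hybrid
    // Julian/Gregorian calendar (java.util.GregorianCalendar switches from
    // Julian to Gregorian at 1582-10-15 by default).
    public static long hybridEpochDay(int year, int month, int day) {
        GregorianCalendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        cal.clear();                    // reset all fields, including time of day
        cal.set(year, month - 1, day);  // Calendar months are 0-based
        return Math.floorDiv(cal.getTimeInMillis(), TimeUnit.DAYS.toMillis(1));
    }

    // Days since 1970-01-01 for the same label in the Proleptic Gregorian
    // calendar (java.time.LocalDate uses the ISO proleptic calendar).
    public static long prolepticEpochDay(int year, int month, int day) {
        return LocalDate.of(year, month, day).toEpochDay();
    }

    // What a reader sees if it treats a stored day count as Proleptic
    // Gregorian without any rebasing.
    public static LocalDate readAsProleptic(long storedEpochDay) {
        return LocalDate.ofEpochDay(storedEpochDay);
    }

    public static void main(String[] args) {
        // A legacy writer stores 1582-10-04 (the last Julian day) as a day count.
        long stored = hybridEpochDay(1582, 10, 4);
        // A reader without rebasing reports a date ten days later.
        System.out.println("stored as hybrid 1582-10-04, read back as "
                + readAsProleptic(stored));  // 1582-10-14

        // From the cutover onward, both calendars agree.
        System.out.println(hybridEpochDay(1582, 10, 15) == prolepticEpochDay(1582, 10, 15));  // true
    }
}
```

Values on or after October 15, 1582 are identical in both calendars, which is why only older dates are affected when legacy files are read without detection.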
