Benjamin0313 opened a new pull request, #16665: URL: https://github.com/apache/iceberg/pull/16665
Map Iceberg's time type to Spark 4.1's TimeType (added in SPARK-51162) for row-based reads and writes across Parquet, ORC, and Avro. Iceberg stores time as microseconds from midnight, while Spark stores it as nanoseconds, so values are converted at the boundary (multiplied by 1000 on read, divided by 1000 on write).

Vectorized reads are intentionally left unsupported for now: Spark 4.1's ColumnarBatch (ColumnarBatchRow#get) does not support TimeType, and exposing time through the shared Arrow accessor would require an engine-wide change. SparkBatch therefore falls back to row-based reads when a time column is projected.

Co-Authored-By: Claude Opus 4.8 <[email protected]>
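
For reviewers who want the unit conversion spelled out, here is a minimal sketch of the microsecond/nanosecond boundary described above. The class name `TimeUnitConversion` and both method names are hypothetical and are not taken from this PR. The overflow check via `Math.multiplyExact` and the use of `Math.floorDiv` on the write side are assumptions about reasonable behavior, not a statement of what the PR actually does.

```java
public final class TimeUnitConversion {
  // 1 microsecond = 1000 nanoseconds
  private static final long NANOS_PER_MICRO = 1_000L;

  private TimeUnitConversion() {}

  // Read path: Iceberg microseconds from midnight -> nanoseconds from midnight.
  // multiplyExact throws ArithmeticException instead of silently overflowing.
  public static long microsToNanos(long micros) {
    return Math.multiplyExact(micros, NANOS_PER_MICRO);
  }

  // Write path: nanoseconds from midnight -> Iceberg microseconds from midnight.
  // Any sub-microsecond digits are dropped (assumption: truncation is acceptable).
  public static long nanosToMicros(long nanos) {
    return Math.floorDiv(nanos, NANOS_PER_MICRO);
  }

  public static void main(String[] args) {
    long micros = 45_296_789_012L; // 12:34:56.789012 as microseconds from midnight
    long nanos = microsToNanos(micros);
    System.out.println("nanos from midnight: " + nanos);
    System.out.println("round trip preserved: " + (nanosToMicros(nanos) == micros));
  }
}
```

Because Iceberg's precision is the coarser of the two, a value read from Iceberg and written back round-trips exactly, while a nanosecond value that originates elsewhere loses its last three digits on write.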
