dor-bernstein opened a new issue, #2452: URL: https://github.com/apache/iceberg-rust/issues/2452
### Describe the bug

When using Apache Comet 0.16 with Iceberg (Spark 3.5.6, AWS Glue catalog), scanning certain Parquet files fails with:

```
org.apache.comet.CometNativeException: Iceberg scan error: Unexpected => file scan task generate failed, source: Unexpected => Parquet file metadata does not contain a column index
```

This appears to affect Parquet files that were written before column indexes were standard (i.e. migrated/older files that lack column index metadata).

### Steps to reproduce

1. Use Apache Comet 0.16 with Spark 3.5.6 and Iceberg (AWS Glue catalog)
2. Run a query against an Iceberg table whose Parquet files lack a column index in their metadata

### Expected behavior

Iceberg should handle Parquet files that don't have a column index gracefully, falling back to row group statistics or skipping column index pruning.

### Additional context

Reported in [apache/datafusion-comet#4125](https://github.com/apache/datafusion-comet/issues/4125#issuecomment-4431270259). A Comet maintainer suggested this is likely an iceberg-rust issue since it relates to migrated Parquet files that lack column index metadata.
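
### Illustrative sketch of the expected fallback

To make the expected behavior concrete, here is a minimal sketch of the fallback logic. Everything in it is an assumption for illustration: `MinMax`, `RowGroupMeta`, `ScanPlan`, and `plan_row_group` are hypothetical, simplified stand-ins and not the iceberg-rust or `parquet` crate API, and the predicate is reduced to an equality check on a single `i64` column. The point is only that a missing column index results in "read all pages" instead of an error.

```rust
// Hypothetical, simplified types; NOT the iceberg-rust or parquet crate API.

/// Min/max statistics for one page or one row group of an i64 column.
#[derive(Debug, Clone, Copy, PartialEq)]
struct MinMax {
    min: i64,
    max: i64,
}

/// Metadata for a single row group.
#[derive(Debug, Clone)]
struct RowGroupMeta {
    /// Row-group-level statistics (may be missing).
    stats: Option<MinMax>,
    /// Per-page statistics, i.e. the column index. `None` for older or
    /// migrated files that were written without one.
    column_index: Option<Vec<MinMax>>,
}

/// What the reader should do with a row group.
#[derive(Debug, PartialEq)]
enum ScanPlan {
    SkipRowGroup,
    ReadAllPages,
    ReadPages(Vec<usize>),
}

/// Returns true if a value equal to `target` could fall inside the range.
fn may_contain(range: MinMax, target: i64) -> bool {
    range.min <= target && target <= range.max
}

/// Plans a scan for the predicate `col = target`.
fn plan_row_group(rg: &RowGroupMeta, target: i64) -> ScanPlan {
    // Coarse pruning on row group statistics works with or without a column index.
    if let Some(stats) = rg.stats {
        if !may_contain(stats, target) {
            return ScanPlan::SkipRowGroup;
        }
    }

    match &rg.column_index {
        // Column index present: keep only pages whose range may match.
        Some(pages) => {
            let keep: Vec<usize> = pages
                .iter()
                .enumerate()
                .filter(|(_, page)| may_contain(**page, target))
                .map(|(idx, _)| idx)
                .collect();
            if keep.is_empty() {
                ScanPlan::SkipRowGroup
            } else {
                ScanPlan::ReadPages(keep)
            }
        }
        // Column index absent: skip page-level pruning instead of failing.
        None => ScanPlan::ReadAllPages,
    }
}
```

With this shape, a row group from an older file (`column_index: None`) is still pruned by its row group statistics when possible and is otherwise read in full, which is the graceful degradation this issue asks for.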
