[I] Enable native BatchScanExec for Iceberg COW tables [auron]

via GitHub Sat, 29 Nov 2025 22:43:12 -0800


ShreyeshArangath opened a new issue, #1676:
URL: https://github.com/apache/auron/issues/1676


   **Is your feature request related to a problem? Please describe.**
   As detailed in #1472, Auron currently doesn’t support native execution for 
DSv2 reads using Iceberg. BatchScanExec plans for Iceberg tables are executed 
by Spark only, so Iceberg reads (especially COW tables) cannot benefit from 
native acceleration.
   
   **Describe the solution you'd like**
   Introduce a NativeIcebergBatchScanExec that:
   - Hooks into the existing convert provider infrastructure to convert 
BatchScanExec plans backed by SparkBatchQueryScan (Iceberg) into a native scan.
   - Converts Iceberg InputPartition / FileScanTask into FilePartition + 
PartitionedFile compatible with the existing native file scan protobufs 
(Parquet/ORC).
   - Supports COW Iceberg tables for Parquet/ORC in the initial version, 
guarded by a feature flag (`spark.auron.enable.iceberg.scan`).
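
The conversion described in the list above can be sketched as a small, self-contained Python model. This is only an illustration of the intended data flow, not Auron's implementation: the classes `FileScanTask`, `PartitionedFile`, and `FilePartition` below are simplified hypothetical stand-ins for the Iceberg and Spark types of the same names, `to_file_partitions` is an invented helper, and treating the flag as off by default is an assumption. Only the flag name comes from this issue.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Flag name as proposed in this issue; the "false" default below is an assumption.
FLAG = "spark.auron.enable.iceberg.scan"


@dataclass(frozen=True)
class FileScanTask:
    """Hypothetical stand-in for one Iceberg file scan task (a byte range of a data file)."""
    path: str
    start: int
    length: int


@dataclass(frozen=True)
class PartitionedFile:
    """Hypothetical stand-in for the file split consumed by the native file scan."""
    path: str
    start: int
    length: int


@dataclass(frozen=True)
class FilePartition:
    """Hypothetical stand-in for a group of file splits read by one Spark task."""
    index: int
    files: List[PartitionedFile]


def to_file_partitions(
    conf: Dict[str, str],
    input_partitions: List[List[FileScanTask]],
) -> Optional[List[FilePartition]]:
    """Map each input partition's scan tasks to one FilePartition.

    Returns None when the feature flag is off, meaning the plan is left
    unconverted and Spark executes the scan as it does today.
    """
    if conf.get(FLAG, "false").lower() != "true":
        return None
    partitions = []
    for index, tasks in enumerate(input_partitions):
        files = [PartitionedFile(t.path, t.start, t.length) for t in tasks]
        partitions.append(FilePartition(index, files))
    return partitions
```

In this sketch one input partition maps to exactly one `FilePartition`, which preserves the task boundaries already planned on the Iceberg side; whether the real converter keeps or re-splits those boundaries is not specified in this issue.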
   
   **Describe alternatives you've considered**
   N/A
   
   **Additional context**
   - Initial scope: COW tables only, Parquet/ORC, basic predicate pushdown 
where already supported by the existing native pruning expression converters.
   - MOR/delete-file handling, time-travel, and metadata table support can be 
handled in follow-up issues once the base NativeIcebergBatchScanExec is in place.
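
The scope limits above suggest an eligibility check that runs before any conversion and falls back to Spark otherwise. A minimal sketch in Python, under stated assumptions: `ScanTask` and `fallback_reason` are hypothetical names, the boolean inputs stand in for whatever the real scan object exposes, and "any task carrying delete files" is used here as the signal for MOR reads.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Formats named in the initial scope of this issue.
SUPPORTED_FORMATS = {"parquet", "orc"}


@dataclass(frozen=True)
class ScanTask:
    """Hypothetical, simplified view of one Iceberg scan task."""
    path: str
    file_format: str
    delete_files: Tuple[str, ...] = ()


def fallback_reason(
    tasks: List[ScanTask],
    reads_metadata_table: bool = False,
    is_time_travel: bool = False,
) -> Optional[str]:
    """Return why the scan must stay on Spark, or None if it is in scope."""
    if reads_metadata_table:
        return "metadata tables are out of scope"
    if is_time_travel:
        return "time-travel reads are out of scope"
    for task in tasks:
        if task.file_format.lower() not in SUPPORTED_FORMATS:
            return f"unsupported file format: {task.file_format}"
        if task.delete_files:
            # Delete files imply merge-on-read semantics, deferred to follow-ups.
            return f"delete files attached to {task.path}"
    return None
```

Returning a reason string rather than a bare boolean makes it easy to log why a particular plan was not converted.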


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
