[PR] [AURON #2015] Add Native Scan Support for Apache Iceberg Copy-On-Write Tables. [auron]

via GitHub Tue, 17 Feb 2026 17:12:18 -0800


slfan1989 opened a new pull request, #2016:
URL: https://github.com/apache/auron/pull/2016


   ### Which issue does this PR close?
   
   Closes #<issue_number>
   
   ### Rationale for this change
   
   This PR adds native scan support for Apache Iceberg Copy-On-Write (COW)
   tables to improve query performance. Auron currently has no direct Iceberg
   integration, so every Iceberg query runs on Spark's built-in execution path
   and gets no acceleration from the native engine.
   
   #### Key Motivations:
   
   - Enable Auron's native execution engine to read Iceberg tables directly
   - Leverage native performance optimizations for Iceberg COW tables
   - Provide automatic fallback to Spark scan for unsupported scenarios
   - Lay the foundation for future Iceberg feature enhancements (MOR tables, pruning predicates, etc.)
   
   ### What changes are included in this PR?
   
   #### Core Implementation:
   
   - **IcebergConvertProvider** - SPI extension point that detects Iceberg scans and decides whether to use native execution
   - **IcebergScanSupport** - Decision logic that validates scan plans and checks for COW table eligibility
   - **NativeIcebergTableScanExec** - Native execution node that converts Iceberg FileScanTask to native scan plans
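
The eligibility decision described above can be illustrated with a minimal, self-contained sketch. It is not the code added by this PR: the class `IcebergScanDecisionSketch`, the `ScanSummary` fields, and the reason strings are hypothetical names chosen for illustration. Only the configuration key, its default of `true`, the supported formats, and the fallback conditions (residual filters, metadata columns, decimal types, delete files) are taken from this description.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of a native-vs-fallback decision; not the PR's IcebergScanSupport.
final class IcebergScanDecisionSketch {
    // Config key and default (true) as stated in the PR description.
    static final String ENABLE_KEY = "spark.auron.enable.iceberg.scan";

    // Hypothetical summary of the scan properties that matter for the decision.
    static final class ScanSummary {
        final String fileFormat;
        final boolean hasResidualFilter;
        final boolean readsMetadataColumns;
        final boolean hasDecimalColumns;
        final boolean hasDeleteFiles;

        ScanSummary(String fileFormat, boolean hasResidualFilter, boolean readsMetadataColumns,
                    boolean hasDecimalColumns, boolean hasDeleteFiles) {
            this.fileFormat = fileFormat;
            this.hasResidualFilter = hasResidualFilter;
            this.readsMetadataColumns = readsMetadataColumns;
            this.hasDecimalColumns = hasDecimalColumns;
            this.hasDeleteFiles = hasDeleteFiles;
        }
    }

    // The toggle is treated as enabled when the key is absent.
    static boolean nativeScanEnabled(Map<String, String> conf) {
        return Boolean.parseBoolean(conf.getOrDefault(ENABLE_KEY, "true"));
    }

    // Collects every reason to fall back to Spark's scan; an empty list means eligible.
    static List<String> fallbackReasons(Map<String, String> conf, ScanSummary scan) {
        List<String> reasons = new ArrayList<>();
        if (!nativeScanEnabled(conf)) {
            reasons.add("disabled by config");
        }
        String format = scan.fileFormat.toLowerCase(Locale.ROOT);
        if (!format.equals("parquet") && !format.equals("orc")) {
            reasons.add("unsupported file format");
        }
        if (scan.hasResidualFilter) {
            reasons.add("residual filter");
        }
        if (scan.readsMetadataColumns) {
            reasons.add("metadata columns");
        }
        if (scan.hasDecimalColumns) {
            reasons.add("decimal type");
        }
        if (scan.hasDeleteFiles) {
            reasons.add("delete files present");
        }
        return reasons;
    }

    static boolean useNativeScan(Map<String, String> conf, ScanSummary scan) {
        return fallbackReasons(conf, scan).isEmpty();
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        ScanSummary cow = new ScanSummary("parquet", false, false, false, false);
        ScanSummary mor = new ScanSummary("parquet", false, false, false, true);
        System.out.println(useNativeScan(conf, cow));   // prints "true"
        System.out.println(fallbackReasons(conf, mor)); // prints "[delete files present]"
    }
}
```

Returning the list of reasons rather than a bare boolean is a design choice of this sketch; it makes it easy to log why a given scan fell back to Spark.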
   
   #### Build & Configuration:
   - Updated `pom.xml` with Iceberg version management and Maven enforcer rules
   - Modified `auron-build.sh` to support Iceberg build parameters
   - Added configuration option: `spark.auron.enable.iceberg.scan` (default: true)
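
As an illustration, and assuming the option is read like any other Spark configuration entry, native Iceberg scans could be switched off for a deployment with a line in `spark-defaults.conf` (the same key and value can also be passed with `--conf` on `spark-submit`):

```
spark.auron.enable.iceberg.scan  false
```

With the option left unset, the default of `true` stated above applies.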
   
   #### Supported Features:
   - Iceberg COW tables (Parquet and ORC formats)
   - Projection pushdown (column pruning)
   - Partitioned and non-partitioned tables
   - Automatic fallback for unsupported scenarios
   
   #### Version Support:
   - Spark: 3.4, 3.5, 4.0 only
   - Iceberg: 1.10.1 only (enforced by Maven)
   
   ### Are there any user-facing changes?
   
   **No Breaking Changes**: Existing functionality remains unchanged. Iceberg
   support is additive, and unsupported scenarios automatically fall back to
   Spark's scan.
   
   ### How was this patch tested?
   
   #### Unit & Integration Tests:
   
   - Added 10 integration test cases in AuronIcebergIntegrationSuite:
     - Simple COW table scan
     - Projection pushdown
     - Partitioned table with partition filter
     - ORC format support
     - Empty table handling
     - Residual filters fallback
     - Metadata columns fallback
     - Decimal type fallback
     - Delete files (MOR) fallback
     - Configuration toggle functionality
   
   #### Test Environment:
   - Spark versions: 3.4.4, 3.5.8, 4.0.2
   - Iceberg version: 1.10.1
   - File formats: Parquet, ORC


