[PR] feat: Native Parquet Iceberg Data File Writes In Comet [datafusion-comet]

via GitHub Wed, 27 May 2026 18:50:56 -0700


jordepic opened a new pull request, #4487:
URL: https://github.com/apache/datafusion-comet/pull/4487


   ## Which issue does this PR close?
   
   Closes #4322.
   
   ## Rationale for this change
   
   Comet, up until this point, has mainly been focused with accelerating reads 
from iceberg tables.  However, a significant resources are being spent across 
various companies in order to rewrite iceberg data.  Large tables need to be 
compacted to maintain their sort/Z order, and general data pipelines may write 
significant amounts of iceberg data.  Having to do a large transpose between 
columnar and row-wise data is inefficient, and we'd much prefer to go directly 
from arrow-based column batches to parquet on disk.
   
   ## What changes are included in this PR?
   
   This change is split into three parts.
   
   1) Splitting the existing iceberg V2 spark write command into two: a 
"writer" spark operator and a "committer" spark operator.
   - This allows the iceberg data file write to be treated identically to a 
spark V1 parquet writing command, thereby allowing using similar code to handle 
the operator
   - Without it, both the data file writing and committing would live outside 
of the scope of spark adaptive query execution operator wrappers - we instead 
want the write operator to be within AQE so that as the data feeds into it gets 
re-planned we can determine whether the upstream data of the write is columnar
   
   2) Determining which operators should be converted to native code
   - This follows a simple philosophy: only convert writes to native code if 
they produce an "identical" outcome as the Java path
   - Disclaimer: nothing truly produces an identical result because parquet row 
group flushing is different
   - Besides that though, we can generally convert "normal" writes (parquet, 
default settings, some flexibility to change other settings as outlined in 
`iceberg-writes.md`, no delete files since iceberg-rust doesn't support 
positional deletes/DVs)
   
   3) Native iceberg write operator
   - The JVM is responsible for computing all iceberg writing settings and 
using a protobuf object to pass them to rust
   - Rust uses iceberg-rs in order to write the file and return avro-encoded 
dummy manifest bytes back to the JVM
   
   ## How are these changes tested?
   
   This change is tested extensively.
   1) Tests to ensure that iceberg writes are replaced by our "two-operator" 
structure
   2) Tests to ensure that the comet JVM properly serializes relevant data to 
protocol buffer form to go to the native layer
   3) Tests to ensure that native writes are only performed under very specific 
iceberg table properties
   4) Tests to ensure that native writes actually function as expected
   5) Tests to ensure that compaction/sorting/z-ordering can now be fully 
accelerated with native writing
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[PR] feat: Native Parquet Iceberg Data File Writes In Comet [datafusion-comet]

Reply via email to