Re: [PR] feat: support native Comet scan of plain Delta Lake tables [datafusion-comet]

via GitHub Wed, 17 Jun 2026 14:18:57 -0700


adityavaish commented on PR #4669:
URL: 
https://github.com/apache/datafusion-comet/pull/4669#issuecomment-4735610375


   @schenksj would you be willing to review this and share your thoughts? You 
have the most context on native Delta in Comet (the delta-kernel-rs scan in 
#3932 and the contrib restructure in #4366), so your perspective would be very 
valuable.
   
   This PR is deliberately a much smaller, complementary approach to yours: 
behind `spark.comet.scan.delta.enabled` (default off), it just reuses Comet's 
existing native Parquet reader for **plain** Delta tables (no deletion vectors, 
no column mapping), and conservatively falls back to Spark otherwise. It's 
essentially a revival of #3035. It is **not** meant to replace the full 
delta-kernel integration — rather to give the common case a native path today 
while the contrib module matures.
   
   A few specific things I'd love your opinion on:
   1. Does an in-core, flag-gated Tier-0 like this make sense alongside the 
contrib `delta-kernel` work, or would you prefer all Delta support live in 
`contrib/`?
   2. The fallback guards: column mapping is detected via 
`DeltaParquetFileFormat.columnMappingMode` (reflection) and deletion vectors 
via the synthetic `__delta_internal_*` scan columns. Are there other Delta 
features that would make a plain native Parquet read incorrect that I should 
also guard against?
   3. The Delta test dependency is wired per Spark profile (delta-core 2.4.0 
for 3.4; delta-spark 3.3.2/4.0.0/4.1.0/4.2.0 for 3.5/4.0/4.1/4.2). Does that 
match how you handle versioning in the contrib suite?
   
   Thanks for taking a look!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] feat: support native Comet scan of plain Delta Lake tables [datafusion-comet]

Reply via email to