malinjawi opened a new pull request, #12218:
URL: https://github.com/apache/gluten/pull/12218

   ## What changes are proposed in this pull request?
   
   Addresses #12195.
   
   Delta Change Data Feed (CDF) reads enter Spark as 
`CDCReader.DeltaCDFRelation`, so they do not initially have the usual 
`FileSourceScanExec` + `DeltaParquetFileFormat` shape that Gluten's existing 
Delta scan offload rule recognizes.
   
   This PR adds a Gluten Delta planner strategy, wired from the Velox Delta 
component, that recognizes `DeltaCDFRelation`, expands it through Delta's own 
CDF batch planning path, and rewrites the original projection/filter attributes 
onto the expanded logical plan. After that, the existing Delta scan offload 
path can plan the underlying CDF file scans as `DeltaScanTransformer`.
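   
   The attribute-rewrite step above can be illustrated with a minimal, 
self-contained sketch. It is a toy model, not the code in this PR: `Attr`, 
`EqualTo`, `buildMapping`, `rewriteAttr`, and `rewriteFilter` are hypothetical 
stand-ins for Catalyst attributes and expressions, and matching columns by name 
is an assumption made for illustration only.
   
   ```scala
   // Toy model only: these types are hypothetical stand-ins, not Spark, Delta,
   // or Gluten classes.
   object CdfAttributeRewriteSketch {
     // An attribute is identified by its expression id; the name is for lookup.
     final case class Attr(name: String, exprId: Long)

     // A filter condition reduced to "column = literal" for brevity.
     final case class EqualTo(attr: Attr, value: String)

     // Map each attribute of the original relation to the same-named attribute
     // of the expanded plan (name matching is an assumption of this sketch).
     // Fails loudly if the expanded plan lacks a column.
     def buildMapping(original: Seq[Attr], expanded: Seq[Attr]): Map[Long, Attr] = {
       val byName = expanded.map(a => a.name -> a).toMap
       original.map { a =>
         val target = byName.getOrElse(
           a.name,
           throw new IllegalStateException(s"Expanded plan has no column ${a.name}"))
         a.exprId -> target
       }.toMap
     }

     // Attributes without a mapping entry are returned unchanged.
     def rewriteAttr(a: Attr, mapping: Map[Long, Attr]): Attr =
       mapping.getOrElse(a.exprId, a)

     def rewriteFilter(f: EqualTo, mapping: Map[Long, Attr]): EqualTo =
       f.copy(attr = rewriteAttr(f.attr, mapping))

     def main(args: Array[String]): Unit = {
       // Output of the original CDF relation, as referenced by the user's query.
       val original = Seq(Attr("id", 1L), Attr("_change_type", 2L), Attr("_commit_version", 3L))
       // Output of the expanded plan: same columns, fresh expression ids.
       val expanded = Seq(Attr("id", 101L), Attr("_change_type", 102L), Attr("_commit_version", 103L))

       val mapping = buildMapping(original, expanded)
       val projection = original.take(2).map(rewriteAttr(_, mapping))
       val filter = rewriteFilter(EqualTo(original(1), "insert"), mapping)

       println(projection)
       println(filter)
     }
   }
   ```
   
   In this sketch the projection and filter end up referring only to the 
expanded plan's attributes, which is the precondition for a downstream rule to 
plan the underlying file scans without seeing stale references.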
   
   The change is intentionally scoped to Gluten's Delta/Spark planning layer 
rather than Velox C++:
   
   - Add `DeltaCDFScanStrategy` for `table_changes(...)` and DataFrame 
`readChangeFeed` scans.
   - Add Delta-version helper shims for Delta 2.3, 2.4, 3.3, and 4.x API 
differences.
   - Register the planner strategy from `VeloxDeltaComponent`.
   - Add Delta regression coverage for insert/update/delete CDF rows, 
filter/projection handling, bounded version reads, DataFrame `readChangeFeed`, 
and column mapping.
   
   ## How was this patch tested?
   
   Local checks used 
`JAVA_HOME=/opt/homebrew/opt/openjdk@17/libexec/openjdk.jdk/Contents/Home`.
   
   - `git diff --check`
   - `./dev/format-scala-code.sh check`
   - `./build/mvn -pl gluten-delta -Pspark-3.5 -Pjava-17 -Pbackends-velox 
-Pdelta -DskipTests test-compile`
   - `./build/mvn -pl gluten-delta -am -Pspark-3.3 -Pjava-17 -Pbackends-velox 
-Pdelta -DskipTests -Dcheckstyle.skip=true -Dscalastyle.skip=true 
-Dspotless.check.skip=true test-compile`
   - `./build/mvn -pl gluten-delta -am -Pspark-3.4 -Pjava-17 -Pbackends-velox 
-Pdelta -DskipTests -Dcheckstyle.skip=true -Dscalastyle.skip=true 
-Dspotless.check.skip=true test-compile`
   - `./build/mvn -pl gluten-delta -am -Pspark-3.5 -Pjava-17 -Pbackends-velox 
-Pdelta -DskipTests -Dcheckstyle.skip=true -Dscalastyle.skip=true 
-Dspotless.check.skip=true test-compile`
   - `./build/mvn -pl gluten-delta -am -Pspark-4.0 -Pscala-2.13 -Pjava-17 
-Pbackends-velox -Pdelta -DskipTests -Dcheckstyle.skip=true 
-Dscalastyle.skip=true -Dspotless.check.skip=true test-compile`
   - `./build/mvn -pl gluten-delta -am -Pspark-4.1 -Pscala-2.13 -Pjava-17 
-Pbackends-velox -Pdelta -DskipTests -Dcheckstyle.skip=true 
-Dscalastyle.skip=true -Dspotless.check.skip=true test-compile`
   
   Full native Velox runtime tests and benchmarking are left to CI or a native 
Gluten-optimized environment, because this local checkout has neither 
`cpp/build/releases/libgluten.so` nor the Velox external project build 
available.
   
   ## Was this patch authored or co-authored using generative AI tooling?
   
   Generated-by: OpenAI Codex (GPT-5)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

