malinjawi opened a new pull request, #12218: URL: https://github.com/apache/gluten/pull/12218
## What changes are proposed in this pull request?

Addresses #12195.

Delta CDF reads enter Spark as `CDCReader.DeltaCDFRelation`, so they do not initially have the normal `FileSourceScanExec` + `DeltaParquetFileFormat` shape that Gluten's existing Delta scan offload rule recognizes.

This PR adds a Gluten Delta planner strategy, wired from the Velox Delta component, that recognizes `DeltaCDFRelation`, expands it through Delta's own CDF batch planning path, and rewrites the original projection/filter attributes onto the expanded logical plan. After that, the existing Delta scan offload path can plan the underlying CDF file scans as `DeltaScanTransformer`.

The change is intentionally scoped to Gluten's Delta/Spark planning layer rather than Velox C++:

- Add `DeltaCDFScanStrategy` for `table_changes(...)` and DataFrame `readChangeFeed` scans.
- Add Delta-version helper shims for Delta 2.3, 2.4, 3.3, and 4.x API differences.
- Register the planner strategy from `VeloxDeltaComponent`.
- Add Delta regression coverage for insert/update/delete CDF rows, filter/projection handling, bounded version reads, DataFrame `readChangeFeed`, and column mapping.

## How was this patch tested?

Local checks used `JAVA_HOME=/opt/homebrew/opt/openjdk@17/libexec/openjdk.jdk/Contents/Home`.

- `git diff --check`
- `./dev/format-scala-code.sh check`
- `./build/mvn -pl gluten-delta -Pspark-3.5 -Pjava-17 -Pbackends-velox -Pdelta -DskipTests test-compile`
- `./build/mvn -pl gluten-delta -am -Pspark-3.3 -Pjava-17 -Pbackends-velox -Pdelta -DskipTests -Dcheckstyle.skip=true -Dscalastyle.skip=true -Dspotless.check.skip=true test-compile`
- `./build/mvn -pl gluten-delta -am -Pspark-3.4 -Pjava-17 -Pbackends-velox -Pdelta -DskipTests -Dcheckstyle.skip=true -Dscalastyle.skip=true -Dspotless.check.skip=true test-compile`
- `./build/mvn -pl gluten-delta -am -Pspark-3.5 -Pjava-17 -Pbackends-velox -Pdelta -DskipTests -Dcheckstyle.skip=true -Dscalastyle.skip=true -Dspotless.check.skip=true test-compile`
- `./build/mvn -pl gluten-delta -am -Pspark-4.0 -Pscala-2.13 -Pjava-17 -Pbackends-velox -Pdelta -DskipTests -Dcheckstyle.skip=true -Dscalastyle.skip=true -Dspotless.check.skip=true test-compile`
- `./build/mvn -pl gluten-delta -am -Pspark-4.1 -Pscala-2.13 -Pjava-17 -Pbackends-velox -Pdelta -DskipTests -Dcheckstyle.skip=true -Dscalastyle.skip=true -Dspotless.check.skip=true test-compile`

Full native Velox runtime and benchmarking are left to CI / a native Gluten-optimized environment; this local checkout does not have `cpp/build/releases/libgluten.so` and the Velox external project build available.

## Was this patch authored or co-authored using generative AI tooling?

Generated-by: OpenAI Codex (GPT-5)
