leaves12138 opened a new pull request, #8182: URL: https://github.com/apache/paimon/pull/8182
## What changed

- Add a data-evolution delete rewriter that rewrites normal and blob files by retained row-id ranges.
- Wire matched `DELETE` actions into Spark DataEvolution `MERGE INTO`, including mixed update/delete sequencing and Spark 4.0 parity.
- Add RowTracking and Blob coverage for merge delete and update+delete cases.

## Why

DataEvolution tables currently handle merge updates/inserts, but matched deletes need to physically rewrite the affected row-id ranges so normal files and corresponding blob files stay aligned.

## Validation

- `mvn -pl paimon-spark/paimon-spark-common -am -Pspark3,fast-build -DskipTests compile`
- `JAVA_HOME=/Users/yejunhao/Library/Java/JavaVirtualMachines/ms-17.0.16/Contents/Home PATH=/Users/yejunhao/Library/Java/JavaVirtualMachines/ms-17.0.16/Contents/Home/bin:$PATH mvn -pl paimon-spark/paimon-spark-4.0 -am -Pspark4,fast-build -DskipTests compile`
- `mvn -pl paimon-spark/paimon-spark-3.5 -am -Pspark3,fast-build -DfailIfNoTests=false -DwildcardSuites=org.apache.paimon.spark.sql.RowTrackingTest -Dtest=none test` ran 37 tests; the new data-evolution delete tests did not fail, while 5 existing local failures hit codegen loader / generated class instantiation issues.
- `mvn -pl paimon-spark/paimon-spark-ut -am -Pspark3,fast-build -DfailIfNoTests=false -DwildcardSuites=org.apache.paimon.spark.sql.BlobTestWithV2Write -Dtest=none test` ran 15 tests; the new blob delete test did not fail, while 2 existing local failures hit the same codegen loader issue.
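
## Illustration: retained row-id ranges

To make "rewrites ... by retained row-id ranges" concrete, here is a minimal sketch of how retained ranges could be derived for one file. It is not the rewriter from this PR: `RetainedRanges`, `retained`, and the `long[]{from, to}` representation are hypothetical names chosen for illustration, and the sketch assumes a file covers a contiguous row-id range and that the deleted row ids arrive sorted in ascending order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Paimon's API.
final class RetainedRanges {

    /**
     * Returns the inclusive row-id ranges {from, to} that survive after removing
     * the deleted ids from a file covering [firstRowId, firstRowId + rowCount - 1].
     * Assumes sortedDeleted is in ascending order.
     */
    static List<long[]> retained(long firstRowId, long rowCount, long[] sortedDeleted) {
        List<long[]> ranges = new ArrayList<>();
        long next = firstRowId;                 // first row id not yet classified
        long last = firstRowId + rowCount - 1;  // last row id in the file
        for (long deleted : sortedDeleted) {
            if (deleted < next || deleted > last) {
                continue; // duplicate id, or id outside this file's range
            }
            if (deleted > next) {
                ranges.add(new long[] {next, deleted - 1}); // gap before the deleted row
            }
            next = deleted + 1;
        }
        if (next <= last) {
            ranges.add(new long[] {next, last}); // tail after the last deleted row
        }
        return ranges;
    }

    public static void main(String[] args) {
        // File holds row ids 100..109; rows 102, 103 and 107 matched a DELETE.
        for (long[] r : retained(100, 10, new long[] {102, 103, 107})) {
            System.out.println(r[0] + ".." + r[1]);
        }
    }
}
```

Under these assumptions, applying the same retained ranges to a normal file and to its corresponding blob file would drop the same rows from both, which is the alignment property described under "Why".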
