gengliangwang opened a new pull request, #55635: URL: https://github.com/apache/spark/pull/55635
### What changes were proposed in this pull request?

This PR implements group filtering for WriteDelta row level operations. It re-applies #55612 (commit `5ef2e1ba174`, reverted in `8e8fee2692f`) and resolves the test failures reported in https://github.com/apache/spark/pull/55612#issuecomment-4350126373 by updating the scan-count assertions in the transactional check tests in `MergeIntoTableSuiteBase` and `UpdateTableSuiteBase`.

With group filtering, `matchingRowsPlan` re-scans the target, and for MERGE `RewritePredicateSubquery` also re-scans the source. For MERGE the delta scan counts now match the non-delta values, so the `deltaMerge` conditionals collapse. For UPDATE the delta counts double but remain under the non-delta values because `ReplaceData` still adds further scans.

### Why are the changes needed?

These changes are needed to close the gap in WriteDelta plans.

### Does this PR introduce _any_ user-facing change?

Changes are backward compatible.

### How was this patch tested?

This PR comes with tests. Locally verified all 9 affected suites are green (517 tests):

```
build/sbt 'sql/testOnly \
  org.apache.spark.sql.connector.DeltaBasedMergeIntoTableSuite \
  org.apache.spark.sql.connector.DeltaBasedMergeIntoTableWithDeletionVectorsSuite \
  org.apache.spark.sql.connector.DeltaBasedMergeIntoTableUpdateAsDeleteAndInsertSuite \
  org.apache.spark.sql.connector.DeltaBasedUpdateTableSuite \
  org.apache.spark.sql.connector.DeltaBasedUpdateTableWithDeletionVectorsSuite \
  org.apache.spark.sql.connector.DeltaBasedUpdateAsDeleteAndInsertTableSuite \
  org.apache.spark.sql.connector.DeltaBasedNoMetadataDeleteFromTableSuite \
  org.apache.spark.sql.connector.GroupBasedMergeIntoTableSuite \
  org.apache.spark.sql.connector.GroupBasedUpdateTableSuite'
```

### Was this patch authored or co-authored using generative AI tooling?

Claude Code v2.1.123.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
