gengliangwang opened a new pull request, #55635:
URL: https://github.com/apache/spark/pull/55635

   ### What changes were proposed in this pull request?
   
   This PR implements group filtering for `WriteDelta` row-level operations.
   
   It re-applies #55612 (commit `5ef2e1ba174`, reverted in `8e8fee2692f`) and
   resolves the test failures reported in
   https://github.com/apache/spark/pull/55612#issuecomment-4350126373 by
   updating the scan-count assertions in the transactional check tests in
   `MergeIntoTableSuiteBase` and `UpdateTableSuiteBase`. With group filtering,
   `matchingRowsPlan` re-scans the target, and for MERGE
   `RewritePredicateSubquery` also re-scans the source. For MERGE, the delta
   scan counts now match the non-delta values, so the `deltaMerge`
   conditionals collapse. For UPDATE, the delta counts double but remain under
   the non-delta values because `ReplaceData` still adds further scans.
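
As a rough illustration of why the scan counts rise, here is a toy Python model of group filtering. It is not Spark code: `CountingTable` and `update_with_group_filtering` are hypothetical names, and the two-pass structure is only an assumption about the general idea (first find the groups that contain matching rows, then read only those groups to produce the delta).

```python
class CountingTable:
    """Hypothetical in-memory table that records how often it is scanned."""

    def __init__(self, rows):
        self.rows = rows  # list of (group_id, key, value) tuples
        self.scan_count = 0

    def scan(self, groups=None):
        # Every call counts as one scan; optionally restrict to some groups.
        self.scan_count += 1
        return [r for r in self.rows if groups is None or r[0] in groups]


def update_with_group_filtering(target, cond, new_value):
    # Pass 1: scan the target to find groups holding at least one match.
    matching_groups = {g for (g, k, v) in target.scan() if cond(k, v)}
    # Pass 2: scan only those groups and emit one delta per matching row.
    deltas = [
        (g, k, new_value)
        for (g, k, v) in target.scan(matching_groups)
        if cond(k, v)
    ]
    return matching_groups, deltas


target = CountingTable([
    ("g1", 1, "a"), ("g1", 2, "b"),
    ("g2", 3, "c"),
    ("g3", 4, "d"), ("g3", 5, "e"),
])
groups, deltas = update_with_group_filtering(target, lambda k, v: k >= 4, "x")
print(groups, deltas, target.scan_count)
```

In this toy model the target is scanned twice for a single UPDATE (once to pick the groups, once to read them), which mirrors the doubling described above. The actual counts asserted in the suites depend on Spark's planner and are not reproduced here.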
   
   ### Why are the changes needed?
   
   Without this change, `WriteDelta` plans do not get group filtering. This PR closes that gap.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Changes are backward compatible.
   
   ### How was this patch tested?
   
   This PR comes with tests. All 9 affected suites (517 tests) pass locally:
   
   ```
   build/sbt 'sql/testOnly \
     org.apache.spark.sql.connector.DeltaBasedMergeIntoTableSuite \
     org.apache.spark.sql.connector.DeltaBasedMergeIntoTableWithDeletionVectorsSuite \
     org.apache.spark.sql.connector.DeltaBasedMergeIntoTableUpdateAsDeleteAndInsertSuite \
     org.apache.spark.sql.connector.DeltaBasedUpdateTableSuite \
     org.apache.spark.sql.connector.DeltaBasedUpdateTableWithDeletionVectorsSuite \
     org.apache.spark.sql.connector.DeltaBasedUpdateAsDeleteAndInsertTableSuite \
     org.apache.spark.sql.connector.DeltaBasedNoMetadataDeleteFromTableSuite \
     org.apache.spark.sql.connector.GroupBasedMergeIntoTableSuite \
     org.apache.spark.sql.connector.GroupBasedUpdateTableSuite'
   ```
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   Claude Code v2.1.123.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

