peter-toth opened a new pull request, #38640:
URL: https://github.com/apache/spark/pull/38640

   ### What changes were proposed in this pull request?
   This PR adds DSv2 plan stability tests so that plan changes caused by 
future Spark changes can be tracked.
   
   ### Why are the changes needed?
   We see a few issues with TPCDS DSv2 plans that need fixes in the future. 
But as a first step, it would be good to track the plan changes.
   Please note that currently:
   - q14a, q14b, q38, q87 are removed from `TPCDSV1_4_V2PlanStabilitySuite`
   - q14, q14a are removed from `TPCDSV2_7_V2PlanStabilitySuite`
   
   as those queries would fail in `PushDownLeftSemiAntiJoin`. The rule pushes 
a join down over an aggregate only if the join can be planned as a broadcast 
join, and that decision needs stats to be available. Because the batch 
`Early Filter and Projection Push-Down` has not yet constructed the V2 scans, 
`DataSourceV2Relation.computeStats()` throws an exception due to missing 
accurate stats.
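
The failure mode described above can be illustrated with a small toy model. This is only a sketch, not Spark code: the classes `V2Relation` and `V2ScanRelation`, the functions `can_plan_as_broadcast` and `try_check`, and the threshold value are all hypothetical stand-ins chosen for illustration. It shows a stats-based broadcast check succeeding once a scan exists and failing when stats are requested too early.

```python
# Toy model only: every name here is a hypothetical stand-in, not Spark's API.
BROADCAST_THRESHOLD_BYTES = 10 * 1024 * 1024  # assumed threshold for illustration


class StatsUnavailable(Exception):
    """Raised when a relation is asked for stats before its scan exists."""


class V2Relation:
    """Stand-in for a DSv2 relation whose scan has not been built yet."""

    def compute_stats(self):
        raise StatsUnavailable("scan not constructed; no accurate stats")


class V2ScanRelation:
    """Stand-in for the same relation after scan construction."""

    def __init__(self, size_in_bytes):
        self.size_in_bytes = size_in_bytes

    def compute_stats(self):
        return self.size_in_bytes


def can_plan_as_broadcast(relation):
    """Stats-based check, loosely modeled on the decision described above."""
    return relation.compute_stats() <= BROADCAST_THRESHOLD_BYTES


def try_check(relation):
    """Return the check result, or an error string if stats are missing."""
    try:
        return can_plan_as_broadcast(relation)
    except StatsUnavailable as err:
        return f"failed: {err}"
```

In this sketch, `try_check(V2ScanRelation(1024))` passes the check, while `try_check(V2Relation())` reports the missing-stats error, which mirrors why the listed queries cannot yet be included in the suites.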
   
   ### Does this PR introduce _any_ user-facing change?
   'No'
   
   ### How was this patch tested?
   This PR adds only new UTs.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

