aokolnychyi commented on code in PR #38557:
URL: https://github.com/apache/spark/pull/38557#discussion_r1017352098
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/dynamicpruning/RowLevelOperationRuntimeGroupFiltering.scala:
##########
@@ -89,10 +88,8 @@ case class RowLevelOperationRuntimeGroupFiltering(optimizeSubqueries: Rule[Logic
       buildKeys: Seq[Attribute],
       pruningKeys: Seq[Attribute]): Expression = {
-    val buildQuery = Project(buildKeys, matchingRowsPlan)
-    val dynamicPruningSubqueries = pruningKeys.zipWithIndex.map { case (key, index) =>
-      DynamicPruningSubquery(key, buildQuery, buildKeys, index, onlyInBroadcast = false)
-    }
-    dynamicPruningSubqueries.reduce(And)
+    val buildQuery = Aggregate(buildKeys, buildKeys, matchingRowsPlan)
Review Comment:
Got it. I was originally worried that we could miss some future optimizations,
given that dynamic pruning for row-level operations would go through a
different route than normal dynamic partition pruning (DPP).
One alternative would be to extend `DynamicPruningSubquery` with a flag
indicating whether it should be optimized. Up to you, though.
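
To make the flag idea concrete, here is a rough, self-contained sketch. It deliberately uses toy stand-ins rather than Catalyst classes: `Attr`, `Plan`, `PruningSubquerySketch`, the `optimizable` field, and both helper methods are hypothetical names chosen for illustration, not existing Spark APIs. The only parts taken from the diff are the per-key `zipWithIndex` construction and `onlyInBroadcast = false`.

```scala
// Toy stand-ins for Catalyst's Attribute and LogicalPlan (not Spark classes).
final case class Attr(name: String)
final case class Plan(description: String)

// Hypothetical variant of DynamicPruningSubquery with an extra flag.
// `optimizable` is an assumed name; Spark's class has no such parameter.
final case class PruningSubquerySketch(
    pruningKey: Attr,
    buildQuery: Plan,
    buildKeys: Seq[Attr],
    broadcastKeyIndex: Int,
    onlyInBroadcast: Boolean,
    optimizable: Boolean = true)

object PruningSketch {
  // Builds one subquery per pruning key, as the removed lines in the diff did,
  // but marks each one as not subject to further optimization.
  def rowLevelSubqueries(
      pruningKeys: Seq[Attr],
      buildKeys: Seq[Attr],
      buildQuery: Plan): Seq[PruningSubquerySketch] =
    pruningKeys.zipWithIndex.map { case (key, index) =>
      PruningSubquerySketch(
        key, buildQuery, buildKeys, index,
        onlyInBroadcast = false, optimizable = false)
    }

  // An optimizer rule could consult the flag and leave flagged-off subqueries alone.
  def eligibleForOptimization(
      subqueries: Seq[PruningSubquerySketch]): Seq[PruningSubquerySketch] =
    subqueries.filter(_.optimizable)
}
```

With something along these lines, row-level operations would keep going through the same expression type as normal DPP, and existing rules would only need to check the flag before rewriting.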
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]