EnricoMi commented on code in PR #39131:
URL: https://github.com/apache/spark/pull/39131#discussion_r1070415424
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/PushDownLeftSemiAntiJoin.scala:
##
@@ -61,9 +61,10 @@ object PushDownLeftSemiAntiJoin extends Rule[LogicalPlan]
}
// LeftSemi/LeftAnti over Aggregate, only push down if join can be planned as broadcast join.
-case join @ Join(agg: Aggregate, rightOp, LeftSemiOrAnti(_), _, _)
+case join @ Join(agg: Aggregate, rightOp, LeftSemiOrAnti(_), joinCond, _)
if agg.aggregateExpressions.forall(_.deterministic) &&
agg.groupingExpressions.nonEmpty &&
!agg.aggregateExpressions.exists(ScalarSubquery.hasCorrelatedScalarSubquery) &&
+ canPushThroughCondition(agg.children, joinCond, rightOp) &&
Review Comment:
I don't understand. `canPushThroughCondition` is called before the `Join` is
pushed through the `Aggregate`; it was added precisely to prevent that from
happening in this situation. The other cases (e.g. `Union`) call into
`canPushThroughCondition` in the same way.
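
For context, the guard exists because in a self-join the same expression IDs can appear on both sides, so a join-condition attribute produced by the operator's children may also be an output of `rightOp`, making the pushdown ambiguous. Below is a simplified, self-contained model of that check; the object name `CanPushSketch`, the `AttrId` alias, and the plain `Set[Long]` representation are illustrative stand-ins, not Catalyst's actual `AttributeSet` API:

```scala
object CanPushSketch {
  // Catalyst attributes carry unique expression IDs; model an attribute by its ID.
  type AttrId = Long

  // Simplified analogue of canPushThroughCondition: pushing the join below
  // the operator is only safe if no attribute referenced by the join
  // condition appears both in the operator's child outputs and in the
  // right-hand side's outputs (a conflict that self-joins can create).
  def canPushThroughCondition(
      childOutputs: Set[AttrId],
      conditionRefs: Option[Set[AttrId]],
      rightOutputs: Set[AttrId]): Boolean =
    conditionRefs.forall(refs => (refs & rightOutputs & childOutputs).isEmpty)
}
```

With this model, a condition attribute shared by both sides blocks the pushdown, while disjoint attribute sets (or the absence of a condition, as in the pre-patch `_` case) allow it:

```scala
CanPushSketch.canPushThroughCondition(Set(1L, 2L), Some(Set(2L)), Set(2L, 3L)) // false: attr 2 conflicts
CanPushSketch.canPushThroughCondition(Set(1L, 2L), Some(Set(3L)), Set(3L, 4L)) // true: no overlap
```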
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org