maropu commented on a change in pull request #26441: [SPARK-29682][SQL] Resolve conflicting references in aggregate expressions
URL: https://github.com/apache/spark/pull/26441#discussion_r344958732
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ##########
 @@ -949,14 +949,19 @@ class Analyzer(
             if oldVersion.outputSet.intersect(conflictingAttributes).nonEmpty =>
           (oldVersion, oldVersion.copy(serializer = oldVersion.serializer.map(_.newInstance())))
 
-        // Handle projects that create conflicting aliases.
         case oldVersion @ Project(projectList, _)
-            if findAliases(projectList).intersect(conflictingAttributes).nonEmpty =>
-          (oldVersion, oldVersion.copy(projectList = newAliases(projectList)))
+            if hasConflict(projectList, conflictingAttributes) =>
+          (oldVersion,
+            oldVersion.copy(
+              projectList =
+                newNamedExpression(projectList, conflictingAttributes)))
 
         case oldVersion @ Aggregate(_, aggregateExpressions, _)
 
 Review comment:
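   Ah, I see. In the query from the PR description, the dedup logic does not seem to be applied to the `Expand` node: the right side of the join still outputs `nums#121`, the same attribute as the left side (marked with `^^^^^^^^` below):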
   ```
   'Join Inner
   :- Aggregate [nums#121, spark_grouping_id#119], [nums#121, max(0) AS agcol#118]
   :  +- Expand [List(nums#79, nums#120, 0), List(nums#79, null, 1)], [nums#79, nums#121, spark_grouping_id#119]
   :     +- Project [nums#79, nums#79 AS nums#120]
   :        +- Project [value#76 AS nums#79]
   :           +- LocalRelation [value#76]
   +- Aggregate [nums#121, spark_grouping_id#119], [nums#121, max(0) AS agcol#124]
                 ^^^^^^^^
      +- Expand [List(nums#79, nums#120, 0), List(nums#79, null, 1)], [nums#79, nums#121, spark_grouping_id#119]
                                                                                ^^^^^^^^
         +- Project [nums#79, nums#79 AS nums#120]
            +- Project [value#76 AS nums#79]
               +- LocalRelation [value#76]
   ```
   So could we fix this dedup issue by adding a case for `Expand` in `dedupRight`, like this?
   ```
           case oldVersion @ Expand(_, output, _)
               if oldVersion.outputSet.intersect(conflictingAttributes).nonEmpty =>
             (oldVersion, oldVersion.copy(output = output.map(_.newInstance())))
   ```
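   To illustrate the idea behind the suggested `Expand` case, here is a minimal, self-contained sketch. It does not use Spark: `Attr`, `Expand`, and `dedupExpand` are hypothetical toy stand-ins that only mimic the shape of Catalyst's attributes and the `(oldVersion, newVersion)` pair returned by `dedupRight`. It assumes that giving every output attribute a fresh expression ID is enough to remove the conflict.
   ```scala
   import java.util.concurrent.atomic.AtomicLong

   // Toy model only: these classes imitate Catalyst names but are NOT Spark's API.
   object ExpandDedupSketch {
     // Fresh IDs start well above the IDs used in the example plan.
     private val nextId = new AtomicLong(1000)

     final case class Attr(name: String, exprId: Long) {
       // Same name, new expression ID (analogous to Attribute.newInstance()).
       def newInstance(): Attr = copy(exprId = nextId.getAndIncrement())
       override def toString: String = s"$name#$exprId"
     }

     // Only the output attributes matter for this sketch.
     final case class Expand(output: Seq[Attr]) {
       def outputSet: Set[Attr] = output.toSet
     }

     // Returns Some((old, new)) when the node's output overlaps the conflicting
     // attributes, mirroring the pattern of the proposed case; otherwise None.
     def dedupExpand(node: Expand, conflicting: Set[Attr]): Option[(Expand, Expand)] =
       if (node.outputSet.intersect(conflicting).nonEmpty)
         Some((node, node.copy(output = node.output.map(_.newInstance()))))
       else None

     def main(args: Array[String]): Unit = {
       // Output of the right-hand Expand from the plan above.
       val right = Expand(Seq(Attr("nums", 79), Attr("nums", 121), Attr("spark_grouping_id", 119)))
       val conflicting = Set(Attr("nums", 121), Attr("spark_grouping_id", 119))
       dedupExpand(right, conflicting).foreach { case (oldV, newV) =>
         println(s"old: ${oldV.output.mkString(", ")}")
         println(s"new: ${newV.output.mkString(", ")}")
       }
     }
   }
   ```
   Note that this sketch, like the snippet above, renews every output attribute, including the pass-through `nums#79`; whether only the attributes produced by `Expand` itself should get new IDs is an open question for the real fix.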
