[GitHub] [spark] cloud-fan commented on a change in pull request #31758: [SPARK-34639][SQL] Always remove unnecessary Alias in Analyzer.resolveExpression

GitBox Tue, 09 Mar 2021 00:10:12 -0800


cloud-fan commented on a change in pull request #31758:
URL: https://github.com/apache/spark/pull/31758#discussion_r590052460




##########
File path: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala
##########
@@ -80,11 +80,7 @@ class RelationalGroupedDataset protected[sql](
     }
   }
 
-  // Wrap UnresolvedAttribute with UnresolvedAlias, as when we resolve UnresolvedAttribute, we
-  // will remove intermediate Alias for ExtractValue chain, and we need to alias it again to
-  // make it a NamedExpression.

Review comment:
       This comment is wrong: we don't remove top-level aliases for aggregate
expressions. The code it describes also causes a problem with this patch.
Wrapping `UnresolvedAttribute` in `UnresolvedAlias` means the attribute is no
longer top-level, so its alias is now removed and `UnresolvedAlias` generates a
different name.
   
   For a nested field `a.b`, the resolved expression was previously
`Alias(GetStructField(...), "b")`, and the `Alias` was not removed.
`UnresolvedAlias` was therefore redundant and simply dropped, so the final
output column name was `b`. With this patch we remove the `Alias`, so
`UnresolvedAlias` kicks in and generates a new `Alias` named `a.b`, which is a
behavior change.
   
   Here I simply remove this `UnresolvedAlias` to keep the behavior the same as
before.
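   
   To make the naming difference concrete, here is a minimal sketch of the
three cases above. It is a toy model, not Catalyst code: the case classes only
borrow the names of the real expressions, and `outputName`, `trimNestedAlias`,
and `generatedName` are hypothetical helpers introduced for illustration.

```scala
object AliasSketch {
  // Toy stand-ins for the Catalyst expressions named in this thread (assumption:
  // the real classes carry far more state than shown here).
  sealed trait Expr
  final case class GetStructField(struct: String, field: String) extends Expr
  final case class Alias(child: Expr, name: String) extends Expr
  final case class UnresolvedAttribute(nameParts: Seq[String]) extends Expr
  final case class UnresolvedAlias(child: Expr) extends Expr

  // Resolving a nested field a.b yields Alias(GetStructField(a, b), "b").
  private def resolveAttribute(attr: UnresolvedAttribute): Alias =
    attr.nameParts match {
      case Seq(struct, field) => Alias(GetStructField(struct, field), field)
      case parts =>
        throw new IllegalArgumentException(s"expected a two-part name, got: $parts")
    }

  // Name that UnresolvedAlias invents when its child is not already named.
  private def generatedName(e: Expr): String = e match {
    case GetStructField(struct, field) => s"$struct.$field"
    case other => other.toString
  }

  // trimNestedAlias = false models the old behavior, true models this patch:
  // an Alias produced under a wrapper (not top-level) is stripped.
  def outputName(expr: Expr, trimNestedAlias: Boolean): String = expr match {
    // Top-level attribute: the Alias is kept in both modes.
    case attr: UnresolvedAttribute => resolveAttribute(attr).name
    case UnresolvedAlias(attr: UnresolvedAttribute) =>
      val resolved = resolveAttribute(attr)
      if (trimNestedAlias) generatedName(resolved.child) // alias stripped, new name
      else resolved.name                                 // wrapper is redundant
    case Alias(_, name) => name
    case other => generatedName(other)
  }
}
```

   In this model the wrapped attribute yields `b` without alias trimming and
`a.b` with it, while the unwrapped attribute yields `b` in both modes, which is
why dropping the wrapper preserves the old column name.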




