metanil commented on code in PR #55885:
URL: https://github.com/apache/spark/pull/55885#discussion_r3270141243


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala:
##########
@@ -552,7 +552,9 @@ object KeyedPartitioning {
 
   def supportsExpressions(expressions: Seq[Expression]): Boolean = {
     def isSupportedTransform(transform: TransformExpression): Boolean = {
-      transform.children.size == 1 && isReference(transform.children.head)
+      // TransformExpression.collectLeaves() only returns column references, 
not literals.
+      // We need exactly one column reference per transform.
+      transform.collectLeaves().size == 1

Review Comment:
   Thanks @sunchao  for the review. Good point. I didn't think about nested 
transform. This is indeed widening the transforms, as `collectLeaves()` go 
through all the `children` in nested expressions. 
   Like you mentioned, adding direct-reference constraints here would be 
enough. I'll make that change by getting non literal children, and use same 
existing check (size and reference). With that we also don't have to then worry 
about `TransformExpression.isSameFunction` comparing only outer function.
   
   With regards to adding support to nested transform for SPJ, very interesting 
idea, but we can take that as separate PR.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to