sunchao commented on code in PR #55885:
URL: https://github.com/apache/spark/pull/55885#discussion_r3470467090


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala:
##########
@@ -641,19 +640,15 @@ object KeyedPartitioning {
 
   def supportsExpressions(expressions: Seq[Expression]): Boolean = {
     def isSupportedTransform(transform: TransformExpression): Boolean = {
-      transform.children.size == 1 && isReference(transform.children.head)
-    }
-
-    @tailrec
-    def isReference(e: Expression): Boolean = e match {
-      case _: Attribute => true
-      case g: GetStructField => isReference(g.child)
-      case _ => false
+      // Should only consider column references, not literals.
+      val nonLiteralChildren = 
transform.children.filterNot(_.isInstanceOf[Literal])
+      // We need exactly one column reference per transform.
+      nonLiteralChildren.size == 1 && 
TransformExpression.isColumnRef(nonLiteralChildren.head)

Review Comment:
   [P1] Apply bound-function input casts before evaluating parameterized 
transforms through SPJ
   
   This gate now admits parameterized transforms whose bound function requests 
legal implicit casts. `BoundFunction.inputTypes()` explicitly allows types to 
differ from those passed to `bind`, with Spark responsible for casting. 
However, the scan path stores the raw Catalyst children in 
`TransformExpression`, and the identity-vs-transform reducer later binds and 
directly evaluates that expression without running Analyzer type coercion.
   
   For a concrete case, a connector can report `truncate(str_col, 
Expressions.literal(2.toShort))`; binding `(StringType, ShortType)` to a scalar 
function that declares `(StringType, IntegerType)` is legal. This transform 
passes the new gate, but the synthetic reducer evaluates the raw `Short` 
literal through `ApplyFunctionExpression`. Its 
`SpecificInternalRow(IntegerType)` then attempts to cast the boxed `Short` to 
`Int`, raising `ClassCastException`. Before parameterized transforms were 
admitted here, this query would fall back to a shuffle.
   
   Please coerce the transform children to `function.inputTypes()` before 
direct reducer evaluation (or reject mismatched types from this SPJ path), and 
add an identity-vs-parameterized-transform regression covering an implicitly 
cast literal.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to