Github user kiszk commented on the issue: https://github.com/apache/spark/pull/20021 Here is [another example](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala#L414). This is complicated. `result` is passed to `arguments` of `ctx.splitExpressions` in `genHashForStruct`. The `result` may get `ev.value` in a certain call path that is started from [here](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala#L277) As this example points out, while `arguments` in `ctx.splitExpressions` has local variable, a local variable may have `ev.value` in some calling context. It does not seem to be easy to ensure no global variable in `arguments`. Thus, IMHO, it would be good to conbine localization and adding assert. Then, we declare a contract `no global variable in arguments` in comment. WDYT? ``` // hash.scala, line 277 override def doGenCode(ctx, ev): ExprCode = { ... computeHash(childGen.value, child.dataType, ev.value, ctx) ... } // hash.scala, line 448 protected def computeHash(input, dataType, result, ctx): String = computeHashWithTailRec(input, dataType, result, ctx) // hash.scala, line 414 private def computeHashWithTailRec(input, dataType, result, ctx): String = dataType match { ... case StructType(fields) => genHashForStruct(ctx, input, result, fields) } //hash.scala, line 402 protected def genHashForStruct(ctx, input, result, fields) { ... ctx.splitExpressions( ... arguments = Seq("InternalRow" -> input, hashResultType -> result), ... } ```
--- --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org