[GitHub] spark pull request #22066: [SPARK-25084][SQL] "distribute by" on multiple co...

yucai Sat, 11 Aug 2018 06:39:21 -0700

Github user yucai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22066#discussion_r209426886
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala ---
    @@ -404,21 +404,26 @@ abstract class HashExpression[E] extends Expression {
           input: String,
           result: String,
           fields: Array[StructField]): String = {
    +    val tmpInput = ctx.freshName("input")
         val fieldsHash = fields.zipWithIndex.map { case (field, index) =>
    -      nullSafeElementHash(input, index.toString, field.nullable, field.dataType, result, ctx)
    +      nullSafeElementHash(tmpInput, index.toString, field.nullable, field.dataType, result, ctx)
         }
         val hashResultType = CodeGenerator.javaType(dataType)
    -    ctx.splitExpressions(
    +    val code = ctx.splitExpressions(
           expressions = fieldsHash,
           funcName = "computeHashForStruct",
    -      arguments = Seq("InternalRow" -> input, hashResultType -> result),
    +      arguments = Seq("InternalRow" -> tmpInput, hashResultType -> result),
           returnType = hashResultType,
           makeSplitFunction = body =>
             s"""
                |$body
                |return $result;
              """.stripMargin,
           foldFunctions = _.map(funcCall => s"$result = $funcCall;").mkString("\n"))
    +    s"""
    +       |final InternalRow $tmpInput = $input;
    --- End diff --
    
    Yes, I fully agree; we can improve this in the future.



