Github user mgaido91 commented on a diff in the pull request:
https://github.com/apache/spark/pull/19563#discussion_r147145637
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/HashExpressionsSuite.scala ---
@@ -639,6 +639,63 @@ class HashExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
     assert(hiveHashPlan(wideRow).getInt(0) == hiveHashEval)
   }
+  test("SPARK-22284: Compute hash for nested structs") {
+    val M = 80
+    val N = 10
+    val L = M * N
+    val O = 50
+    val seed = 42
+
+    val wideRow1 = new GenericInternalRow(Seq.tabulate(O)(j =>
+      new GenericInternalRow(Seq.tabulate(L)(i =>
+        new GenericInternalRow(Array[Any](
+          UTF8String.fromString((j * L + i).toString))))
+        .toArray[Any])).toArray[Any])
+    var inner1 = new StructType()
--- End diff ---
What about avoiding the use of `var` here and in the other places by passing a `Seq` of fields to the constructor? The fields can be created with range generation and `map` instead of `for` loops. I think that would be more in line with the usual functional Scala style. What do you think?
---