GitHub user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/19496#discussion_r145182120
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
@@ -2103,4 +2103,13 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
testData2.select(lit(7), 'a, 'b).orderBy(lit(1), lit(2), lit(3)),
Seq(Row(7, 1, 1), Row(7, 1, 2), Row(7, 2, 1), Row(7, 2, 2), Row(7, 3, 1), Row(7, 3, 2)))
}
+
+ test("SPARK-22271: mean overflows and returns null for some decimal variables") {
+ val d = 0.034567890
+ val df = Seq(d, d, d, d, d, d, d, d, d, d).toDF("DecimalCol")
+ val result = df.select('DecimalCol cast DecimalType(38, 33))
+ .select(col("DecimalCol")).describe()
+ val mean = result.select("DecimalCol").where($"summary" === "mean")
+ assert(mean.collect.toSet === Set(Row("0.0345678900000000000000000000000000000")))
--- End diff --
@gatorsmile Thanks Sean for your review. I will fix the problems.
---