GitHub user juliuszsompolski commented on a diff in the pull request:
https://github.com/apache/spark/pull/21133#discussion_r184343803
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/ApproximatePercentileQuerySuite.scala
---
@@ -279,4 +282,10 @@ class ApproximatePercentileQuerySuite extends QueryTest with SharedSQLContext {
checkAnswer(query, expected)
}
}
+
+ test("SPARK-24013: unneeded compress can cause performance issues with sorted input") {
+ failAfter(20 seconds) {
+ assert(sql("select approx_percentile(id, array(0.1)) from range(10000000)").count() == 1)
--- End diff --
Calling `.count()` here needs none of the query's output columns, so column
pruning removes `approx_percentile` from the query plan. As written, the test
never actually executes `approx_percentile`.
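One way to check this locally is to print the optimized plan that the count runs and compare it with the plan of the query itself. The sketch below is illustrative only: the object name `CountPruningSketch`, its helper methods, and the smaller `range(1000)` are assumptions made for the example and are not part of the PR. It also assumes that `Dataset.count()` is evaluated as a global `groupBy().count()` over the query, which is my understanding of the implementation.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical helper object, not part of the PR under review.
object CountPruningSketch {
  // Same shape as the query in the diff, with a smaller range so it runs quickly.
  def query(spark: SparkSession): DataFrame =
    spark.sql("select approx_percentile(id, array(0.1)) from range(1000)")

  // Optimized plan of a global count over the query (assumed to match
  // what Dataset.count() executes). Inspect it for the percentile aggregate.
  def countPlan(spark: SparkSession): String =
    query(spark).groupBy().count().queryExecution.optimizedPlan.toString

  // collect() has to return the percentile value itself, so the aggregate
  // cannot be pruned away on this path.
  def collectedRows(spark: SparkSession): Int =
    query(spark).collect().length

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("count-pruning-sketch")
      .getOrCreate()
    try {
      println("Plan of the query itself:")
      println(query(spark).queryExecution.optimizedPlan)
      println("Plan of the count over the query:")
      println(countPlan(spark))
      assert(collectedRows(spark) == 1)
    } finally {
      spark.stop()
    }
  }
}
```

If the second plan no longer mentions the percentile aggregate, asserting on `collect().length` instead of `count()` is one possible way to keep the aggregate in the executed plan.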