GitHub user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/19394#discussion_r143317742
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/ConfigBehaviorSuite.scala ---
@@ -58,7 +58,7 @@ class ConfigBehaviorSuite extends QueryTest with SharedSQLContext {
 withSQLConf(SQLConf.RANGE_EXCHANGE_SAMPLE_SIZE_PER_PARTITION.key -> "1") {
 // If we only sample one point, the range boundaries will be pretty bad and the
 // chi-sq value would be very high.
- assert(computeChiSquareTest() > 1000)
+ assert(computeChiSquareTest() > 300)
--- End diff --
Updating the SparkPlan code to avoid creating unnecessary RDDs fixed this
test, because the same random seed is used. But I think we should keep this
change so we don't have to track down this problem again in the future. The
new bound of 300 is perfectly safe because the balanced chi-sq value is 10,
while the unbalanced value is much larger.
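
To illustrate why a threshold far below the unbalanced value still separates the two cases, here is a minimal sketch of a Pearson chi-square statistic over partition sizes, measured against a uniform expectation. It is not the suite's `computeChiSquareTest`; the object `ChiSquareSketch`, the method `chiSquareUniform`, and the sample partition sizes are all hypothetical and chosen only for illustration.

```scala
// Hypothetical illustration, not the code in ConfigBehaviorSuite.
object ChiSquareSketch {

  /** Pearson chi-square statistic of observed partition sizes
    * against the assumption that all partitions are equally sized. */
  def chiSquareUniform(observed: Seq[Long]): Double = {
    require(observed.nonEmpty, "need at least one partition")
    val expected = observed.sum.toDouble / observed.size
    require(expected > 0.0, "need at least one row")
    observed.map { o =>
      val diff = o - expected
      diff * diff / expected
    }.sum
  }

  def main(args: Array[String]): Unit = {
    // Made-up sizes: four nearly equal partitions vs. one that holds almost everything.
    val balanced = Seq(100L, 98L, 102L, 100L)
    val unbalanced = Seq(397L, 1L, 1L, 1L)
    println(f"balanced:   ${chiSquareUniform(balanced)}%.2f")
    println(f"unbalanced: ${chiSquareUniform(unbalanced)}%.2f")
  }
}
```

With these made-up sizes the two statistics differ by several orders of magnitude, so any bound that sits well between them distinguishes good range boundaries from bad ones without depending on the exact value a particular random seed produces.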