Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/22079
The test "model load / save" in ChiSqSelectorSuite fails because of this line in
[ChiSqSelector.scala](https://github.com/apache/spark/blob/branch-2.2/mllib/src/main/scala/org/apache/spark/mllib/feature/ChiSqSelector.scala#L147):
<pre>
spark.createDataFrame(dataArray).repartition(1).write.parquet(Loader.dataPath(path))
</pre>
In 2.4, the line is:
<pre>
spark.createDataFrame(sc.makeRDD(dataArray, 1)).write.parquet(Loader.dataPath(path))
</pre>
If you change 2.4 to use the <code>repartition(1)</code> form as well, and also
revert the follow-up PR (#20426) that avoids sorting when there is only one
partition, this test fails on 2.4 in the same way.
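For anyone who wants to compare the two write paths outside the suite, here is a minimal sketch. It assumes a local SparkSession; the `Data` case class, the sample values, and the object name are hypothetical stand-ins, not the actual ChiSqSelector code. The first form adds a shuffle via `repartition(1)`, while the second builds a single-partition RDD up front and needs no shuffle. Whether the first form reorders rows depends on the Spark build, so no expected output is shown.

```scala
import org.apache.spark.sql.SparkSession

object WriteOrderSketch {
  // Hypothetical row type standing in for the rows the selector saves.
  case class Data(feature: Int)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("write-order-sketch")
      .getOrCreate()
    val sc = spark.sparkContext

    // Hypothetical sample data, deliberately not sorted.
    val rows = Seq(Data(3), Data(1), Data(2))

    // branch-2.2 form: build the DataFrame, then shuffle into one partition.
    val viaRepartition = spark.createDataFrame(rows).repartition(1)

    // 2.4 form: start from a single-partition RDD, so no shuffle is needed.
    val viaMakeRDD = spark.createDataFrame(sc.makeRDD(rows, 1))

    // Print the row order each path produces on this build.
    println(viaRepartition.collect().map(_.getInt(0)).mkString(","))
    println(viaMakeRDD.collect().map(_.getInt(0)).mkString(","))

    spark.stop()
  }
}
```

If the two printed orders differ on a given build, that is the same difference the test is tripping over.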
So I am not sure which way to go: update ChiSqSelector.scala to match 2.4
(a one-line change), or make the test accept this new order.