Github user srowen commented on the issue:
https://github.com/apache/spark/pull/16129
@felixcheung, maybe you can advise me on this. I think this is a correct
fix, but it ends up changing the results of decision forests a little bit.
The SparkR test you wrote fails:
```
Failed
-------------------------------------------------------------------------
1. Failure: spark.randomForest (@test_mllib.R#937)
-----------------------------
predictions$prediction not equal to c(...).
16/16 mismatches (average diff: 0.108)
[1] 60.3 - 60.4 == -0.0508
[2] 61.2 - 61.1 == 0.1272
[3] 60.7 - 60.6 == 0.0543
[4] 62.1 - 62.3 == -0.1473
[5] 63.5 - 63.7 == -0.2044
[6] 64.1 - 64.3 == -0.2413
[7] 65.1 - 64.9 == 0.2591
[8] 64.3 - 64.3 == 0.0045
[9] 66.7 - 66.7 == 0.0001
...
```
Of course I can just paste in the new values, since I expect a small change
in the result, but I wanted to sense-check it first. The new answers are
closer to the answers in the nearly identical case above with 1 tree, which
seems mildly encouraging.
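
For a rough sense of scale, here is a minimal sketch that summarizes the
mismatches shown in the log. It uses only the nine differences that are
visible there; the other seven of the 16 are elided and are not guessed.
The names `visible_diffs` and `summarize` are just illustrative, not part of
the SparkR test.

```python
# Signed differences copied from the nine mismatches visible in the log.
# The remaining seven of the 16 are elided there and are not reconstructed.
visible_diffs = [
    -0.0508, 0.1272, 0.0543, -0.1473, -0.2044,
    -0.2413, 0.2591, 0.0045, 0.0001,
]


def summarize(diffs):
    """Return simple summary statistics for a list of signed differences."""
    abs_diffs = [abs(d) for d in diffs]
    return {
        "count": len(diffs),
        "mean_abs": sum(abs_diffs) / len(abs_diffs),  # typical size of the shift
        "max_abs": max(abs_diffs),                    # worst single mismatch
        "mean_signed": sum(diffs) / len(diffs),       # near 0 means no clear bias
    }


summary = summarize(visible_diffs)
print(summary)
```

A mean signed difference close to zero alongside a small mean absolute
difference would be consistent with a small, unbiased shift in the
predictions, though nine values are only a partial view of the full result.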