Joseph K. Bradley created SPARK-7131:
----------------------------------------
Summary: Move tree, forest implementation from spark.mllib to spark.ml
Key: SPARK-7131
URL: https://issues.apache.org/jira/browse/SPARK-7131
Project: Spark
Issue Type: Improvement
Components: ML, MLlib
Affects Versions: 1.4.0
Reporter: Joseph K. Bradley
We want to change and improve the spark.ml API for trees and ensembles, but we
cannot change the old API in spark.mllib. To support the changes we want to
make, we should move the implementation from spark.mllib to spark.ml. We will
generalize and modify it, but will also ensure that we do not change the
behavior of the old API.
This JIRA should be done in several PRs, in this order:
1. Copy the implementation over to spark.ml and change the spark.ml classes to
use that implementation, rather than calling the spark.mllib implementation.
The current spark.ml tests will ensure that the two implementations learn
exactly the same models.
2. Remove the spark.mllib implementation, and make the spark.mllib APIs
wrappers around the spark.ml implementation. The spark.ml tests will again
ensure that we do not change any behavior.
3. Move the unit tests to spark.ml, and change the spark.mllib unit tests to
verify model equivalence.
After these updates, we can more safely generalize and improve the spark.ml
implementation.
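
As a rough illustration of steps 2 and 3, the sketch below shows one way an old
API can become a thin wrapper over a relocated implementation, together with an
equivalence check. Every name in it (WrapperSketch, NewImpl, OldApi, Stump,
StumpModel, equivalent) is hypothetical and not part of Spark. The depth-1
"tree" is only a stand-in so the example runs without Spark on the classpath.

```scala
// Hypothetical sketch: none of these names are real Spark classes.
object WrapperSketch {

  // Stand-in for the implementation once it lives in spark.ml (step 1).
  object NewImpl {
    final case class Stump(threshold: Double, leftPrediction: Double, rightPrediction: Double) {
      def predict(x: Double): Double =
        if (x <= threshold) leftPrediction else rightPrediction
    }

    // Fit a depth-1 regression tree on (feature, label) pairs by picking the
    // midpoint split with the smallest total squared error.
    def train(data: Seq[(Double, Double)]): Stump = {
      val xs = data.map(_._1).distinct.sorted
      require(xs.size >= 2, "need at least two distinct feature values")
      val candidates = xs.zip(xs.tail).map { case (a, b) => (a + b) / 2.0 }

      def mean(v: Seq[Double]): Double = v.sum / v.size
      def sse(v: Seq[Double]): Double = {
        val m = mean(v)
        v.map(y => (y - m) * (y - m)).sum
      }

      candidates.map { t =>
        val (left, right) = data.partition(_._1 <= t)
        val (ly, ry) = (left.map(_._2), right.map(_._2))
        (sse(ly) + sse(ry), Stump(t, mean(ly), mean(ry)))
      }.minBy(_._1)._2
    }
  }

  // Stand-in for the old spark.mllib API after step 2: the public signature
  // and model type stay the same, but training delegates to NewImpl.
  object OldApi {
    final class StumpModel(val threshold: Double, val left: Double, val right: Double) {
      def predict(x: Double): Double = if (x <= threshold) left else right
    }

    def train(data: Seq[(Double, Double)]): StumpModel = {
      val m = NewImpl.train(data)
      new StumpModel(m.threshold, m.leftPrediction, m.rightPrediction)
    }
  }

  // Step 3 style check: both entry points must yield the same split and the
  // same predictions on the given probe points.
  def equivalent(data: Seq[(Double, Double)], probes: Seq[Double]): Boolean = {
    val a = NewImpl.train(data)
    val b = OldApi.train(data)
    a.threshold == b.threshold && probes.forall(x => a.predict(x) == b.predict(x))
  }
}
```

For example, WrapperSketch.equivalent(Seq((1.0, 0.0), (2.0, 0.0), (3.0, 1.0),
(4.0, 1.0)), Seq(0.0, 2.5, 5.0)) trains through both entry points and compares
them. Keeping the old model type as a separate class, rather than aliasing the
new one, is one way to let the new implementation change later without
touching the old API's surface.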