[ https://issues.apache.org/jira/browse/SPARK-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15225018#comment-15225018 ]
Joseph K. Bradley commented on SPARK-4591:
------------------------------------------
I created subtasks for reviewing the major algorithm classes. We can later
decide how to handle the following items.
User-facing:
* Streaming ML (to be done under structured streaming in the 2.x line)
* evaluation
* fpm
* pmml
* stat
Developer-facing:
* optimization
* random, rdd
* util
Note that linalg is being handled separately: [SPARK-13944]
> Algorithm/model parity in spark.ml (Scala)
> ------------------------------------------
>
> Key: SPARK-4591
> URL: https://issues.apache.org/jira/browse/SPARK-4591
> Project: Spark
> Issue Type: Umbrella
> Components: ML
> Reporter: Xiangrui Meng
> Priority: Critical
>
> This is an umbrella JIRA for porting spark.mllib implementations to use the
> DataFrame-based API defined under spark.ml. We want to achieve feature
> parity for the next release.
> Create or link subtasks for:
> * missing algorithms or models (However, this does NOT include stats or
> linear algebra; those will be handled separately.)
> * existing algorithms or models which are missing features, params, etc.
> This only covers Scala since we can compare Scala vs. Python in spark.ml
> itself.
> _Note: Please search JIRA for existing issues to avoid duplicates._