[
https://issues.apache.org/jira/browse/SPARK-1357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963999#comment-13963999
]
Sean Owen commented on SPARK-1357:
----------------------------------
I know I'm late to this party, but I just had a look and wanted to throw out a
few last-minute ideas.
(Do you not want to just declare all of MLlib experimental? Is it really 1.0?
That's a fairly significant set of shackles to put on for a long time.)
OK, that aside, I have two suggestions to mark as experimental:
1. The ALS Rating object assumes users and items are Int. I suggest it will
eventually be interesting to support String, or at least to switch to Long.
2. Per old MLLIB-29, I feel pretty certain that ClassificationModel can't
keep returning RDD[Double], and will want to support returning a distribution
over labels at some point. Similarly, the input to it and to RegressionModel
will likely have to change to encompass something more than Vector, in order
to properly allow for categorical values. DecisionTreeModel has the same issue
but is experimental (and doesn't integrate with these APIs?).
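As a rough illustration of the two suggestions above, here is a minimal sketch in plain Scala. It is not the MLlib API: GenericRating, RatingSketch, Feature, ProbabilisticClassifier and ToyClassifier are all hypothetical names invented for this example, and plain Scala collections stand in for RDDs so the snippet runs without Spark.

```scala
// HYPOTHETICAL sketch, not MLlib code. MLlib's own Rating is
// case class Rating(user: Int, product: Int, rating: Double).

// Suggestion 1: a rating whose ID type is not fixed to Int.
final case class GenericRating[ID](user: ID, product: ID, rating: Double)

object RatingSketch {
  // Assign each distinct String ID a Long index, in order of first appearance.
  def indexIds(ids: Seq[String]): Map[String, Long] =
    ids.distinct.zipWithIndex.map { case (id, i) => id -> i.toLong }.toMap

  // Convert String-keyed ratings into Long-keyed ones.
  def toLongRatings(rs: Seq[GenericRating[String]]): Seq[GenericRating[Long]] = {
    val users = indexIds(rs.map(_.user))
    val products = indexIds(rs.map(_.product))
    rs.map(r => GenericRating(users(r.user), products(r.product), r.rating))
  }
}

// Suggestion 2: input richer than a numeric Vector, and output that is a
// distribution over labels rather than a single Double.
sealed trait Feature
final case class Numeric(value: Double) extends Feature
final case class Categorical(value: String) extends Feature

trait ProbabilisticClassifier {
  // Maps each label to its probability; the values are expected to sum to 1.
  def predictDistribution(features: Seq[Feature]): Map[Double, Double]

  // A single-label prediction is then just the most probable label.
  def predict(features: Seq[Feature]): Double =
    predictDistribution(features).maxBy(_._2)._1
}

// Toy model with hard-coded probabilities, only to exercise the trait.
object ToyClassifier extends ProbabilisticClassifier {
  def predictDistribution(features: Seq[Feature]): Map[Double, Double] =
    if (features.contains(Categorical("red"))) Map(0.0 -> 0.2, 1.0 -> 0.8)
    else Map(0.0 -> 0.7, 1.0 -> 0.3)
}
```

The design point in both cases is that the narrower form (Int IDs, a single Double label) can be derived from the more general one, but not the reverse, which is why the narrower signatures would be hard to keep stable.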
The point is not so much whether one agrees with these, but whether there is a
non-trivial chance of wanting to change something this year.
Other parts that I'm interested in personally look pretty strong. Humbly
submitted.
> [MLLIB] Annotate developer and experimental API's
> -------------------------------------------------
>
> Key: SPARK-1357
> URL: https://issues.apache.org/jira/browse/SPARK-1357
> Project: Spark
> Issue Type: Sub-task
> Components: MLlib
> Affects Versions: 1.0.0
> Reporter: Patrick Wendell
> Assignee: Xiangrui Meng
> Fix For: 1.0.0
>
>