[ 
https://issues.apache.org/jira/browse/SPARK-6113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14349811#comment-14349811
 ] 

Joseph K. Bradley commented on SPARK-6113:
------------------------------------------

Pinging [~MechCoder] since you've been working on tree ensembles.  Before long, 
I hope to start refactoring the tree and ensemble APIs, which will require a 
little coordination.  Here's what I'm planning:
1. I'll make a PR with the new API.  It will use, but not modify, the existing 
tree & ensemble code.
2. Merge or close existing PRs against the old API.
3. I'll make a PR moving the code to the new API, making the old API a wrapper. 
(Please do not open new PRs while this is in progress.)
4. After that, any new PRs will be made against the new API.

Note that, per the design doc, the new and old APIs will be in different 
namespaces:
* old: mllib.tree.*
* new: mllib.classification.* and mllib.regression.*
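
To make "making the old API a wrapper" (step 3) a bit more concrete, here is a 
rough sketch of the pattern, written in plain Python for brevity.  Every class 
and method name below is made up for illustration; none of them come from Spark 
or the design doc, and the toy "training" just picks the majority label.

```python
# Hypothetical sketch only: these names are NOT Spark APIs.

class NewTreeClassifier:
    """Stand-in for an estimator living in the new namespace."""

    def __init__(self, max_depth=5):
        self.max_depth = max_depth

    def fit(self, rows):
        # Toy "training": remember the most common label in the data.
        labels = [label for label, _features in rows]
        majority = max(sorted(set(labels)), key=labels.count)
        return NewTreeModel(majority, self.max_depth)


class NewTreeModel:
    """Stand-in for the model returned by the new API."""

    def __init__(self, prediction, max_depth):
        self.prediction = prediction
        self.max_depth = max_depth

    def predict(self, features):
        # Always returns the majority label learned in fit().
        return self.prediction


class OldDecisionTree:
    """Stand-in for the old entry point, kept as a thin wrapper."""

    @staticmethod
    def train_classifier(rows, max_depth=5):
        # Translate the old-style arguments and delegate to the new API;
        # no training logic lives in the wrapper itself.
        return NewTreeClassifier(max_depth=max_depth).fit(rows)
```

The idea is that callers of the old entry point keep working while the actual 
logic lives in exactly one place, so later fixes only need to target the new 
API.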


> Stabilize DecisionTree and ensembles APIs
> -----------------------------------------
>
>                 Key: SPARK-6113
>                 URL: https://issues.apache.org/jira/browse/SPARK-6113
>             Project: Spark
>          Issue Type: Sub-task
>          Components: MLlib, PySpark
>    Affects Versions: 1.4.0
>            Reporter: Joseph K. Bradley
>            Assignee: Joseph K. Bradley
>            Priority: Critical
>
> *Issue*: The APIs for DecisionTree and ensembles (RandomForests and 
> GradientBoostedTrees) have been experimental for a long time.  The API has 
> become very convoluted because trees and ensembles have many, many variants, 
> some of which we have added incrementally without a long-term design.
> *Proposal*: This JIRA is for discussing changes required to finalize the 
> APIs.  After we discuss, I will make a PR to update the APIs and make them 
> non-Experimental.  This will require making many breaking changes; see the 
> design doc for details.
> [Design doc | 
> https://docs.google.com/document/d/1rJ_DZinyDG3PkYkAKSsQlY0QgCeefn4hUv7GsPkzBP4]:
>  This outlines current issues and the proposed API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

