[ https://issues.apache.org/jira/browse/SPARK-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiangrui Meng updated SPARK-2478:
---------------------------------
Priority: Critical (was: Major)
> Add Python APIs for decision tree
> ---------------------------------
>
> Key: SPARK-2478
> URL: https://issues.apache.org/jira/browse/SPARK-2478
> Project: Spark
> Issue Type: New Feature
> Components: MLlib, PySpark
> Reporter: Xiangrui Meng
> Assignee: Joseph K. Bradley
> Priority: Critical
>
> In v1.0, decision trees are supported only in Scala/Java. It would be nice to
> add Python support. This may require some refactoring of the current decision
> tree API so that a decision tree algorithm is easier to construct from Python:
> 1. Simplify decision tree constructors so that only simple types are used.
>    a. Hide the implementation of Impurity from users.
>    b. Replace enums with strings.
> 2. Make separate public decision tree classes for regression and
>    classification (with shared internals). Eliminate the algo parameter.
> 3. Implement wrappers in Python for DecisionTree.
> 4. Implement wrappers in Python for DecisionTreeModel.
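
A rough illustration of items 1 and 2 above, in plain Python with no Spark
dependency. This is only a sketch of the shape such an API could take, not the
design that was implemented: every name here (TreeClassifier, TreeRegressor,
_train, _IMPURITIES) is hypothetical, and the "training" step only validates
the settings and scores the root node.

```python
import math
from collections import Counter

# All names below are hypothetical; none are taken from the Spark codebase.


def _gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def _entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def _variance(values):
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


# Impurity implementations stay private; callers pick one by string
# (items 1a and 1b).
_IMPURITIES = {
    "classification": {"gini": _gini, "entropy": _entropy},
    "regression": {"variance": _variance},
}


def _train(task, targets, impurity, max_depth):
    """Shared internals (item 2): validate settings, score the root node."""
    allowed = _IMPURITIES[task]
    if impurity not in allowed:
        raise ValueError(
            "impurity for %s must be one of %s, got %r"
            % (task, sorted(allowed), impurity)
        )
    if max_depth < 0:
        raise ValueError("max_depth must be non-negative")
    return {
        "task": task,
        "impurity": impurity,
        "max_depth": max_depth,
        "root_impurity": allowed[impurity](targets),
    }


class TreeClassifier(object):
    """Public classification entry point; no algo parameter needed."""

    @staticmethod
    def train(labels, impurity="gini", max_depth=5):
        return _train("classification", labels, impurity, max_depth)


class TreeRegressor(object):
    """Public regression entry point; shares _train with the classifier."""

    @staticmethod
    def train(values, impurity="variance", max_depth=5):
        return _train("regression", values, impurity, max_depth)
```

With this layout, TreeClassifier.train([0, 0, 1, 1])["root_impurity"] evaluates
to 0.5, and passing impurity="variance" to the classifier raises ValueError.
Because only strings, numbers, and lists cross the public boundary, a Python
wrapper would have nothing but simple types to pass through.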