[ https://issues.apache.org/jira/browse/SPARK-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph K. Bradley updated SPARK-2478:
-------------------------------------

    Description: 
In v1.0, decision trees are supported only in Scala/Java. It would be nice to 
add Python support. This may require refactoring the current decision tree API 
so that a decision tree algorithm is easier to construct from Python.

1. Simplify decision tree constructors so that only simple types are used.
  a. Hide the implementation of Impurity from users.
  b. Replace enums with strings.
2. Make separate public decision tree classes for regression and classification 
(with shared internals). Eliminate the algo parameter.
3. Implement wrappers in Python for DecisionTree.
4. Implement wrappers in Python for DecisionTreeModel.
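
A minimal Python sketch of what items 1 and 2 might look like from the caller's side, assuming string-valued impurity names and two separate entry points that share one internal helper. Every name here (TreeParams, classification_params, regression_params, and the accepted impurity strings) is hypothetical and chosen for illustration only; this is not the MLlib or PySpark API, and it covers only argument handling, not training.

```python
# Hypothetical sketch only: none of these names are the actual Spark/MLlib API.
from dataclasses import dataclass

# Assumed impurity names, selected by plain strings instead of enum values
# or Impurity objects (item 1a/1b).
_CLASSIFICATION_IMPURITIES = {"gini", "entropy"}
_REGRESSION_IMPURITIES = {"variance"}


@dataclass(frozen=True)
class TreeParams:
    """Simple-typed parameters that are easy to pass from Python to another runtime."""
    task: str         # set internally; callers never pass an algo parameter
    impurity: str
    max_depth: int
    num_classes: int  # 0 for regression


def _build_params(task, impurity, allowed, max_depth, num_classes):
    """Shared internals: validate the string arguments and bundle them."""
    name = impurity.lower()
    if name not in allowed:
        raise ValueError(
            "impurity %r is not supported for %s; expected one of %s"
            % (impurity, task, sorted(allowed))
        )
    if max_depth < 0:
        raise ValueError("max_depth must be non-negative")
    return TreeParams(task, name, max_depth, num_classes)


def classification_params(num_classes, impurity="gini", max_depth=5):
    """Classification entry point (item 2): the task is implied by the function."""
    return _build_params(
        "classification", impurity, _CLASSIFICATION_IMPURITIES, max_depth, num_classes
    )


def regression_params(impurity="variance", max_depth=5):
    """Regression entry point (item 2): same internals, different allowed impurities."""
    return _build_params(
        "regression", impurity, _REGRESSION_IMPURITIES, max_depth, 0
    )
```

Because only strings and integers cross the boundary, a Python wrapper (items 3 and 4) would not need to construct enum or Impurity objects on the Scala side; that is the motivation, not a description of how the wrappers were eventually written.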

  was:
In v1.0, decision trees are supported only in Scala/Java. It would be nice to 
add Python support. This may require refactoring the current decision tree API 
so that a decision tree algorithm is easier to construct from Python.

1. Simplify decision tree constructors so that only simple types are used.
  a. Hide the implementation of Impurity from users.
  b. Replace enums with strings.
2. Implement wrappers in Python for DecisionTree.
3. Implement wrappers in Python for DecisionTreeModel.


> Add Python APIs for decision tree
> ---------------------------------
>
>                 Key: SPARK-2478
>                 URL: https://issues.apache.org/jira/browse/SPARK-2478
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib, PySpark
>            Reporter: Xiangrui Meng
>            Assignee: Joseph K. Bradley
>
> In v1.0, decision trees are supported only in Scala/Java. It would be nice to 
> add Python support. This may require refactoring the current decision tree API 
> so that a decision tree algorithm is easier to construct from Python.
> 1. Simplify decision tree constructors so that only simple types are used.
>   a. Hide the implementation of Impurity from users.
>   b. Replace enums with strings.
> 2. Make separate public decision tree classes for regression and classification 
> (with shared internals). Eliminate the algo parameter.
> 3. Implement wrappers in Python for DecisionTree.
> 4. Implement wrappers in Python for DecisionTreeModel.



