[ https://issues.apache.org/jira/browse/SPARK-5133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14614812#comment-14614812 ]
Sean Owen commented on SPARK-5133:
----------------------------------
We don't generally Assign JIRAs while they're being worked on. Anyone is free to
work on anything, though please try to coordinate with anyone else already
working on it. (Look for an open PR or comments; none are linked here, though.)
(Also, JIRA currently has a problem where we can't add new Contributors, and
unless you're added to that group you can't be Assigned :( )
> Feature Importance for Decision Tree (Ensembles)
> ------------------------------------------------
>
> Key: SPARK-5133
> URL: https://issues.apache.org/jira/browse/SPARK-5133
> Project: Spark
> Issue Type: New Feature
> Components: ML, MLlib
> Reporter: Peter Prettenhofer
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> Add feature importance to decision tree model and tree ensemble models.
> If people are interested in this feature, I could implement it given a mentor
> (for API decisions, etc.). A description of the feature follows:
> Decision trees intrinsically perform feature selection by selecting
> appropriate split points. This information can be used to assess the relative
> importance of a feature.
> Relative feature importance gives valuable insight into a decision tree or
> tree ensemble and can even be used for feature selection.
> More information on feature importance (via decrease in impurity) can be
> found in ESLII (Section 10.13.1) or here [1].
> R's randomForest package uses a different technique for assessing variable
> importance that is based on permutation tests.
> All the information needed to compute relative importance scores should
> already be available in the tree representation (class Node: split, impurity
> gain, (weighted) number of samples?).
> [1]
> http://scikit-learn.org/stable/modules/ensemble.html#feature-importance-evaluation
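
For illustration only, the impurity-based importance described in the quoted
issue could be sketched roughly as below. This is not Spark's API or the
eventual implementation: the Node class and its fields (feature, gain,
n_samples) are hypothetical stand-ins for the split, impurity gain, and
(weighted) sample count mentioned above. It assumes each node's gain is the
impurity decrease of its split, weights it by the samples reaching the node,
and normalizes each tree's scores to sum to 1 before averaging over an
ensemble.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical tree node; field names are illustrative, not Spark's."""
    feature: Optional[int] = None   # index of the split feature; None for a leaf
    gain: float = 0.0               # impurity decrease achieved by the split
    n_samples: float = 0.0          # (weighted) number of samples reaching the node
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def tree_importances(root: Node, n_features: int) -> List[float]:
    """Sum sample-weighted impurity gain per feature, normalized to sum to 1."""
    totals = [0.0] * n_features
    stack = [root]
    while stack:
        node = stack.pop()
        if node is None or node.feature is None:
            continue  # leaves contribute nothing
        totals[node.feature] += node.n_samples * node.gain
        stack.append(node.left)
        stack.append(node.right)
    norm = sum(totals)
    return [t / norm for t in totals] if norm > 0 else totals


def ensemble_importances(roots: List[Node], n_features: int) -> List[float]:
    """Average the normalized per-tree importances across an ensemble."""
    per_tree = [tree_importances(r, n_features) for r in roots]
    return [sum(col) / len(per_tree) for col in zip(*per_tree)]
```

Normalizing per tree before averaging gives every tree equal say in the
ensemble score; other weightings are possible, and the permutation-based
approach used by R's randomForest would need the training data rather than
just the tree structure.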