[ 
https://issues.apache.org/jira/browse/SPARK-12773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094361#comment-15094361
 ] 

Rahul Tanwani commented on SPARK-12773:
---------------------------------------

[~srowen] Thanks. Before posting here, I asked on the user mailing list but 
did not get a reply. In case you have any information on this, here is the post: 
http://apache-spark-user-list.1001560.n3.nabble.com/Impurity-and-Samples-details-for-each-node-of-a-decision-tree-td25941.html



> Impurity and Sample details for each node of a decision tree
> ------------------------------------------------------------
>
>                 Key: SPARK-12773
>                 URL: https://issues.apache.org/jira/browse/SPARK-12773
>             Project: Spark
>          Issue Type: Question
>          Components: ML, MLlib
>    Affects Versions: 1.5.2
>            Reporter: Rahul Tanwani
>
> I would like to understand whether each node in the decision tree calculates 
> or stores the number of samples that satisfy the split criteria. 
> Looking at the code, I found some information about the impurity statistics 
> but nothing on the sample counts. scikit-learn exposes both of these 
> metrics. This information may help in cases where multiple decision rules 
> (multiple leaf nodes) yield the same prediction and we want to make 
> relative comparisons between decision paths.
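
To make the use case concrete, here is a minimal sketch in plain Python of the comparison described above. It does not use Spark or scikit-learn, and it does not show how either library stores these values; the leaf names and label lists are hypothetical, and Gini impurity is assumed as the impurity measure.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def leaf_stats(leaves):
    """Map each leaf name to (prediction, sample count, impurity)."""
    stats = {}
    for name, labels in leaves.items():
        # The leaf predicts the majority class among the samples that reached it.
        prediction = Counter(labels).most_common(1)[0][0]
        stats[name] = (prediction, len(labels), gini(labels))
    return stats


# Hypothetical training labels that reached each leaf (not real data).
leaves = {
    "leaf_a": [1, 1, 1, 1, 1, 1, 1, 0],
    "leaf_b": [1, 1, 0],
    "leaf_c": [0, 0, 0, 0],
}

stats = leaf_stats(leaves)

# Leaves that yield the same prediction (class 1), ranked by sample count.
same_prediction = sorted(
    (name for name, (pred, _, _) in stats.items() if pred == 1),
    key=lambda name: stats[name][1],
    reverse=True,
)

for name in same_prediction:
    pred, n_samples, impurity = stats[name]
    print(f"{name}: prediction={pred} samples={n_samples} gini={impurity:.4f}")
```

In this toy data both leaf_a and leaf_b predict class 1, but leaf_a is backed by more samples and has lower impurity, which is the kind of relative comparison that per-node sample counts would allow.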



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)