Rahul Tanwani created SPARK-12773:
-------------------------------------

             Summary: Impurity and Sample details for each node of a decision tree
                 Key: SPARK-12773
                 URL: https://issues.apache.org/jira/browse/SPARK-12773
             Project: Spark
          Issue Type: Question
          Components: ML, MLlib
    Affects Versions: 1.5.2
            Reporter: Rahul Tanwani


I would like to understand whether each node in the decision tree calculates or 
stores the number of samples that satisfy its split criteria. Looking at the 
code, I found some information about impurity statistics but nothing about 
sample counts. scikit-learn exposes both of these metrics. This information 
would help in cases where multiple decision rules (multiple leaf nodes) yield 
the same prediction and we want to compare the decision paths against each 
other.
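
To illustrate the kind of comparison I have in mind, here is a minimal sketch 
in plain Python. The `Leaf` record, the `rank_rules` helper, the rule strings 
and all the numbers are hypothetical and exist only for illustration; none of 
them are Spark or scikit-learn types. It assumes each leaf carries a sample 
count and an impurity value, which is exactly the information I am asking about.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Leaf:
    # Hypothetical record for one leaf node; not a Spark or scikit-learn type.
    rule: str          # human-readable decision path leading to this leaf
    prediction: int    # predicted class label at this leaf
    n_samples: int     # number of training samples that reached this leaf
    impurity: float    # impurity (e.g. Gini) of the samples at this leaf


def rank_rules(leaves: List[Leaf]) -> Dict[int, List[Leaf]]:
    """Group leaves by prediction, then order each group by
    sample count (largest first) and impurity (smallest first)."""
    groups: Dict[int, List[Leaf]] = defaultdict(list)
    for leaf in leaves:
        groups[leaf.prediction].append(leaf)
    return {
        pred: sorted(group, key=lambda l: (-l.n_samples, l.impurity))
        for pred, group in groups.items()
    }


# Made-up leaves: two different rules predict class 1, one predicts class 0.
leaves = [
    Leaf("age <= 30 and income <= 50", 1, 12, 0.15),
    Leaf("age > 30 and owns_home", 1, 240, 0.02),
    Leaf("age > 30 and not owns_home", 0, 95, 0.10),
]

ranked = rank_rules(leaves)
for pred, group in ranked.items():
    for leaf in group:
        print(pred, leaf.n_samples, leaf.impurity, leaf.rule)
```

With per-node sample counts available, the two rules that predict class 1 can 
be told apart: one is backed by far more samples and is purer than the other.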



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
