Rahul Iyer created MADLIB-1205:
----------------------------------

             Summary: Add gini importance to DT and RF
                 Key: MADLIB-1205
                 URL: https://issues.apache.org/jira/browse/MADLIB-1205
             Project: Apache MADlib
          Issue Type: New Feature
          Components: Module: Decision Tree, Module: Random Forest
            Reporter: Rahul Iyer


From the Breiman resource that we use for random forest:
{quote}Gini importance
{quote}
{quote}Every time a split of a node is made on variable m the gini impurity 
criterion for the two descendent nodes is less than the parent node. Adding up 
the gini decreases for each individual variable over all trees in the forest 
gives a fast variable importance that is often very consistent with the 
permutation importance measure.
{quote}
We can add a similar measure to our DT and RF code. To distinguish it from 
our existing permutation-based importance metric, we could name the current 
metric {{oob_variable_importance}} and the new metric 
{{impurity_variable_importance}}.
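
As a rough illustration of the measure described in the quote, here is a 
minimal Python sketch. It is not MADlib code: the function names and the 
representation of a tree as a list of (feature index, parent class counts, 
left child class counts, right child class counts) tuples are hypothetical. 
It also assumes each gini decrease is weighted by the number of rows reaching 
the node, which is one common convention and not something the quote 
specifies.

```python
def gini(counts):
    """Gini impurity of a node, given its per-class row counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)


def impurity_importance(trees, n_features):
    """Sum row-weighted gini decreases per split variable over all trees.

    `trees` is a list of trees; each tree is a list of splits of the form
    (feature_index, parent_counts, left_counts, right_counts).
    This layout is an assumption made for the sketch.
    """
    importance = [0.0] * n_features
    for splits in trees:
        for feature, parent, left, right in splits:
            # Weighted impurity of the parent minus that of its two children.
            decrease = (
                sum(parent) * gini(parent)
                - sum(left) * gini(left)
                - sum(right) * gini(right)
            )
            importance[feature] += decrease
    return importance


# One tree with a single, perfectly separating split on variable 0.
forest = [[(0, [5, 5], [5, 0], [0, 5])]]
print(impurity_importance(forest, n_features=2))
```

A single decision tree is the special case of a forest with one tree. The 
result is left unnormalized here; dividing by the number of trees or scaling 
the values to sum to 1 would be a separate design choice.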



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
