GitHub user njayaram2 opened a pull request: https://github.com/apache/madlib/pull/295

    Recursive Partitioning: Add function to report importance scores

    JIRA: MADLIB-925

    This commit adds a new MADlib function (get_var_importance) to report
    the importance scores of decision tree (DT) and random forest (RF)
    models.

    RF models prior to MADlib 1.15 reported only the oob variable
    importance score; from 1.15 onwards they also report the impurity
    variable importance score. This function reports both scores for
    >=1.15 RF models, and only the oob variable importance score for
    <1.15 RF models.

    When called on a DT model, this function returns the impurity
    variable importance score for >=1.15 DT models.

    Co-authored-by: Jingyi Mei <j...@pivotal.io>
    Co-authored-by: Orhan Kislal <okis...@pivotal.io>

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/madlib/madlib feature/output-importance

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/madlib/pull/295.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #295

----
commit 54a4a17915f6ce1ddea6260db2d06fcd0ee50f51
Author: Nandish Jayaram <njayaram@...>
Date:   2018-07-03T19:22:07Z

    Recursive Partitioning: Add function to report importance scores

----
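
For reviewers, the version-dependent behaviour described in the PR text can be summarised as a small decision table. The Python sketch below is illustrative only and is not taken from the patch: the names `parse_version` and `reported_scores`, the `"rf"`/`"dt"` labels, and the score labels `"oob"` and `"impurity"` are hypothetical. The description does not say what happens for a DT model older than 1.15, so returning an empty list in that case is an assumption. Plain numeric version strings such as `1.15` are also assumed.

```python
def parse_version(version):
    """Turn a numeric version string such as '1.15' or '1.14.2' into a tuple of ints.

    Assumption: no suffixes like '-dev'; those would need extra handling.
    """
    return tuple(int(part) for part in version.split("."))


def reported_scores(model_type, madlib_version):
    """Return the importance scores the PR description says are reported.

    model_type: 'rf' (random forest) or 'dt' (decision tree); hypothetical labels.
    madlib_version: MADlib version the model was trained with, e.g. '1.15'.
    """
    has_impurity = parse_version(madlib_version) >= (1, 15)

    if model_type == "rf":
        # RF models always carry the oob score; impurity was added in 1.15.
        return ["oob", "impurity"] if has_impurity else ["oob"]
    if model_type == "dt":
        # Assumption: a pre-1.15 DT model has no score to report.
        return ["impurity"] if has_impurity else []
    raise ValueError("model_type must be 'rf' or 'dt'")


if __name__ == "__main__":
    for model, version in [("rf", "1.14"), ("rf", "1.15"), ("dt", "1.15")]:
        print(model, version, reported_scores(model, version))
```

The actual function operates on MADlib model tables in the database; this sketch only mirrors which scores are expected for each model type and version.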