[scikit-learn] Gradient Boosting: Feature Importances do not sum to 1

Douglas Chan Tue, 30 Aug 2016 22:30:43 -0700

Hello everyone,

I have noticed cases where the feature importance values do not add up to 1 in 
ensemble tree methods, such as Gradient Boosting Trees or AdaBoost Trees.  I 
wonder if there is a bug in the code.


This error occurs when the ensemble has a large number of estimators.  The 
exact number at which it appears depends on other settings.  For example, the 
error shows up sooner with fewer training samples, or when the trees are deep.

When this error appears, the predicted value seems to have converged.  But it’s 
unclear whether the error is what keeps the predicted value from changing with 
more estimators.  In fact, the feature importance sum keeps getting lower as 
more estimators are added.
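
To make the behaviour described above easier to check, here is a minimal sketch. The helper name importance_sums, the synthetic data from make_regression, and the specific sizes (50 samples, depth 8) are assumptions chosen for illustration, not the exact setup that triggered the problem. The result may differ between scikit-learn versions.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor


def importance_sums(n_estimators_list, n_samples=50, max_depth=8, seed=0):
    """Return {n_estimators: sum of feature_importances_} for each setting.

    Hypothetical helper: a small sample count and deep trees are used
    because those are the conditions under which the sum reportedly
    drifts away from 1 sooner.
    """
    # Synthetic regression data (an assumption, not the original data set).
    X, y = make_regression(
        n_samples=n_samples, n_features=5, noise=0.0, random_state=seed
    )
    sums = {}
    for n in n_estimators_list:
        model = GradientBoostingRegressor(
            n_estimators=n, max_depth=max_depth, random_state=seed
        )
        model.fit(X, y)
        # Total importance across all features for this ensemble size.
        sums[n] = float(np.sum(model.feature_importances_))
    return sums
```

Calling importance_sums([10, 100, 1000]) and printing the result shows whether the sum stays at 1 or falls as estimators are added.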

I wonder if we’re hitting some floating-point calculation error. 
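
One possible way to probe this, sketched under assumptions: sum the importances of each individual tree in a fitted ensemble. The helper name per_tree_importance_sums is hypothetical, and it assumes a fitted gradient boosting model whose estimators_ attribute holds the individual regression trees. If some trees sum to 0 rather than to roughly 1, rounding alone would not explain the shortfall.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor


def per_tree_importance_sums(model):
    """Return the sum of feature importances for every tree in the ensemble.

    Hypothetical diagnostic: `model` is assumed to be a fitted
    GradientBoostingRegressor (or classifier), whose `estimators_`
    is a 2-D array of fitted decision trees.
    """
    return np.array(
        [float(tree.feature_importances_.sum())
         for tree in model.estimators_.ravel()]
    )


# Illustrative data and settings (assumptions, not the original setup).
X, y = make_regression(n_samples=50, n_features=5, noise=0.0, random_state=0)
model = GradientBoostingRegressor(n_estimators=50, max_depth=8, random_state=0)
model.fit(X, y)

tree_sums = per_tree_importance_sums(model)
# Count the trees whose importances sum to (numerically) zero.
n_zero = int(np.sum(np.isclose(tree_sums, 0.0)))
print(f"{n_zero} of {len(tree_sums)} trees have zero total importance")
```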

Looking forward to hearing your thoughts on this.

Thank you!
-Doug

