Thank you. It seems that information value can only be calculated for a binary classification dataset; however, my response variable is continuous.
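
For a continuous response, one possible alternative check is permutation importance measured on held-out data. The following is only a minimal sketch on synthetic data, not my actual dataset: the column layout, coefficients and sample size are made up for illustration, and it assumes a scikit-learn version that ships sklearn.inspection.permutation_importance.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data (assumption): columns 0 and 1 drive y, column 2 is irrelevant.
rng = np.random.RandomState(0)
n = 600
X_real = rng.normal(size=(n, 3))
y = 3.0 * X_real[:, 0] + 2.0 * X_real[:, 1] + rng.normal(scale=0.5, size=n)

# Column 3 is a randomly generated variable with no relation to y.
x_random = rng.normal(size=(n, 1))
X = np.hstack([X_real, x_random])

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)

# Shuffle each column of the held-out set and record the drop in R^2.
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0
)
perm_importance = result.importances_mean

for idx, value in enumerate(perm_importance):
    print(f"feature {idx}: mean drop in R^2 = {value:.4f}")
```

If the random column really carries no information, its mean drop in score should be close to zero on the held-out set, whatever importance the fitted trees assign to it on the training data.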


On 20/04/17 05:51, urvesh patel wrote:
I believe your random variable has some predictive power by chance. In
R, use the Information package and check the information value of that
randomly created variable. If it is > 0.05, then it has good predictive power.
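
The same kind of check can be done directly in scikit-learn. This is only a rough sketch on made-up data (the variable names x_signal, x_binary and x_random and the coefficients are hypothetical); it fits a gradient boosting regressor and prints the impurity-based importances, which are normalised to sum to 1.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Made-up data (assumption): two real predictors and one pure-noise column.
rng = np.random.RandomState(42)
n = 500
x_signal = rng.normal(size=n)          # continuous, strongly related to y
x_binary = rng.randint(0, 2, size=n)   # real predictor with only two values
x_random = rng.normal(size=n)          # generated randomly, unrelated to y
y = 2.0 * x_signal + 1.0 * x_binary + rng.normal(scale=1.0, size=n)

X = np.column_stack([x_signal, x_binary, x_random])
gbr = GradientBoostingRegressor(
    n_estimators=200, max_depth=3, random_state=0
).fit(X, y)

# Impurity-based importances: each feature's share of the total
# impurity reduction over all splits in all trees.
importances = gbr.feature_importances_
for name, imp in zip(["x_signal", "x_binary", "x_random"], importances):
    print(f"{name}: {imp:.3f}")
```

These importances are computed on the training data, so a random continuous column can pick up a share simply because later trees fit noise and a continuous variable offers many candidate split points. The scikit-learn documentation notes that impurity-based importances can be misleading for features with many unique values.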
On Tue, Apr 18, 2017 at 7:47 AM Olga Lyashevska
<o.lyashevsk...@gmail.com> wrote:

    Hi,

    I would like to understand how feature importances are calculated in
    gradient boosting regression.

    I know that these are the relevant functions:
    https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/ensemble/gradient_boosting.py#L1165
    https://github.com/scikit-learn/scikit-learn/blob/fc2f24927fc37d7e42917369f17de045b14c59b5/sklearn/tree/_tree.pyx#L1056

    From the literature and elsewhere I understand that Gini impurity is
    calculated. What is this exactly, and how does it relate to the 'gain'
    vs 'frequency' measures implemented in XGBoost?
    http://xgboost.readthedocs.io/en/latest/R-package/discoverYourData.html

    My problem is that when I fit exactly the same model in sklearn and gbm
    (R package) I get different variable importance plots. One of the
    variables, which was generated randomly (keeping all other variables
    real), appears to be very important in sklearn and very unimportant in
    gbm. How is it possible that a completely random variable gets the
    highest importance?


    Many thanks,
    Olga
    _______________________________________________
    scikit-learn mailing list
    scikit-learn@python.org <mailto:scikit-learn@python.org>
    https://mail.python.org/mailman/listinfo/scikit-learn


