date:20170420

Re: [scikit-learn] feature importance calculation in gradient boosting

2017-04-20 Thread Olga Lyashevska

Thank you. It seems that information value can only be calculated for a binary classification dataset; however, my response variable is continuous.

On 20/04/17 05:51, urvesh patel wrote:
> I believe your random variable by chance has some predictive power. In R, use the Information package and chec

Re: [scikit-learn] sklearn - knn sklearn.neighbors kneighbors function producing unexpected result for text analysis?

2017-04-20 Thread Alex Garel

I'm not totally sure what you're trying to do, but here are some remarks that may help you:

1. in modelfit = model.fit(count_vect, enc), the enc parameter is not used; only the count_vect matrix is used
2. when you use kneighbors you get vectors corresponding to wiki['text'], not to wiki['name']
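The two remarks above can be illustrated with a minimal sketch. It assumes that model is a sklearn.neighbors.NearestNeighbors instance (the snippet does not show which estimator was used), and the names and texts lists are hypothetical stand-ins for wiki['name'] and wiki['text']. The point is that kneighbors returns row positions in the fitted matrix, so names are recovered by indexing rather than through a label encoder.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.neighbors import NearestNeighbors

# Hypothetical stand-ins for wiki['name'] and wiki['text'].
names = ["Alice", "Bob", "Carol"]
texts = [
    "physics quantum energy particle",
    "football goal league match",
    "physics energy relativity particle",
]

# Rows of the count matrix follow the order of `texts`.
counts = CountVectorizer().fit_transform(texts)

# NearestNeighbors is unsupervised: fit() accepts a y argument only for
# API consistency and ignores it, so no encoded labels are passed here.
model = NearestNeighbors(n_neighbors=2).fit(counts)

# Query with the first document. Each returned index is a row position
# in `counts`, i.e. it refers to a text, not to a name.
distances, indices = model.kneighbors(counts[0])

# Map row positions back to names by plain indexing.
neighbor_names = [names[i] for i in indices[0]]
print(neighbor_names)
```

Because the query document is itself part of the fitted data, it comes back as its own nearest neighbour at distance zero; drop the first index if only other documents are wanted.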

Re: [scikit-learn] sklearn - knn sklearn.neighbors kneighbors function producing unexpected result for text analysis?

2017-04-20 Thread Joel Nothman

The problem is the misuse of the label encoder. See https://github.com/scikit-learn/scikit-learn/issues/8767

On 20 April 2017 at 19:58, Alex Garel wrote:
> I'm not totally sure of what you're trying to do, but here are some
> remarks that may help you:
>
> 1. in modelfit = model.fit(count_vect,