2012/9/4 Andreas Mueller <[email protected]>:
> On 09/04/2012 03:43 PM, Olivier Grisel wrote:
>> 2012/9/4 Andreas Mueller <[email protected]>:
>>> Hi everybody.
>>> I'm pretty new to feature selection stuff and I tried to use the chi2
>>> selection.
>>> I got a pvalue of exactly zero on one of the features and one of e-250
>>> on another one.
>>> That seems a bit fishy, in particular as they don't seem to correlate
>>> very strongly.
>>> Maybe I misunderstood something.
>>> Any hints?
>> Can you put a gist with a reproduction script + data subset?
>>
> Here:
> https://gist.github.com/3621861

There are typos in the script. Here is a corrected version:

import numpy as np
from sklearn.feature_selection import chi2
data = np.load("values.npy")
labels = np.load("labels.npy")
print(chi2(data, labels))

Here is the output:

(array([  2004.40215852,  14210.70616566,    101.8947577 ,     54.13746251,
           49.02911194,   2029.54198255,     67.61064593,    564.51314541]),
 array([  0.00000000e+000,   0.00000000e+000,   5.85512658e-024,
          1.86942888e-013,   2.52191591e-012,   0.00000000e+000,
          1.99189131e-016,   8.76728404e-125]))
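
The exact zeros in the second array may simply be floating-point underflow rather than a bug. Below is a minimal sketch, assuming one degree of freedom (two classes). The thread does not state this, but it appears consistent with the non-zero p-values printed above. It calls SciPy directly instead of the code from the gist, and the statistics are copied by hand from the output:

```python
import numpy as np
from scipy.special import erfcx
from scipy.stats import chi2 as chi2_dist

# Three of the chi2 statistics printed above (copied by hand).
stats = np.array([2004.40215852, 14210.70616566, 564.51314541])

# ASSUMPTION: one degree of freedom, i.e. a two-class problem.
df = 1

# Survival function = p-value. Anything smaller than the tiniest
# representable double (about 5e-324) is rounded to exactly 0.0.
p_values = chi2_dist.sf(stats, df)

# For df == 1 the survival function is erfc(sqrt(x / 2)).
# erfcx(z) = exp(z**2) * erfc(z) stays finite for large z, so the
# logarithm of the p-value can be computed without underflow.
z = np.sqrt(stats / 2.0)
log10_p = (np.log(erfcx(z)) - z ** 2) / np.log(10.0)

print(p_values)
print(log10_p)
```

If that assumption holds, the zeros just mean "smaller than a double can hold", and comparing the features on a log scale (or by the raw chi2 statistic) is more informative than comparing the printed p-values.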



-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
