2012/9/4 Andreas Mueller <[email protected]>:
> On 09/04/2012 03:43 PM, Olivier Grisel wrote:
>> 2012/9/4 Andreas Mueller <[email protected]>:
>>> Hi everybody.
>>> I'm pretty new to feature selection stuff and I tried to use the chi2
>>> selection.
>>> I got a pvalue of exactly zero on one of the features and one of e-250
>>> on another one.
>>> That seems a bit fishy, in particular as they don't seem to correlate
>>> very strongly.
>>> Maybe I misunderstood something.
>>> Any hints?
>> Can you put a gist with a reproduction script + data subset?
>>
> Here:
> https://gist.github.com/3621861
There are typos in the script. With them fixed, it reads:
import numpy as np
from sklearn.feature_selection import chi2
data = np.load("values.npy")
labels = np.load("labels.npy")
print(chi2(data, labels))
Here is the output:
(array([  2004.40215852,  14210.70616566,    101.8947577 ,     54.13746251,
            49.02911194,   2029.54198255,     67.61064593,    564.51314541]),
 array([  0.00000000e+000,   0.00000000e+000,   5.85512658e-024,
          1.86942888e-013,   2.52191591e-012,   0.00000000e+000,
          1.99189131e-016,   8.76728404e-125]))
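
One possible reading of the exact zeros is plain floating-point underflow rather than a bug. The sketch below is only an illustration and not how sklearn computes its p-values: it assumes one degree of freedom (i.e. two classes, which the gist data may or may not have), and the helper names are made up for this example. With one degree of freedom the chi2 survival function is erfc(sqrt(x / 2)), so only the standard library is needed.

```python
import math


def chi2_sf_df1(stat):
    """Survival function P(X > stat) for a chi2 variable with 1 degree of freedom.

    Uses the identity P(Z**2 > stat) = erfc(sqrt(stat / 2)) for a standard normal Z.
    Hypothetical helper, named for this example only.
    """
    return math.erfc(math.sqrt(stat / 2.0))


def log10_chi2_sf_df1_approx(stat):
    """Leading-order asymptotic estimate of log10 of the p-value (1 degree of freedom).

    Based on erfc(z) ~ exp(-z**2) / (z * sqrt(pi)) for large z, so it is only
    meaningful for large statistics. It stays finite where the p-value
    itself underflows.
    """
    z = math.sqrt(stat / 2.0)
    return (-z * z - math.log(z * math.sqrt(math.pi))) / math.log(10.0)


# Statistics copied from the output above (first and last features).
p_first = chi2_sf_df1(2004.40215852)
p_last = chi2_sf_df1(564.51314541)
log10_p_first = log10_chi2_sf_df1_approx(2004.40215852)

print(p_first)        # too small for a double, so it underflows
print(p_last)         # tiny but still representable
print(log10_p_first)  # order of magnitude of the underflowed p-value
```

If the one-degree-of-freedom assumption holds, a statistic in the thousands corresponds to a p-value far below the smallest double (around 1e-308 for normal values), so a printed 0.0 would just mean "smaller than can be represented". Whether statistics that large are plausible for weakly correlated features is a separate question.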
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general