i have a situation i cannot quite comprehend. any suggestions would be much
appreciated.
---
clf = sk.svm.SVC(kernel='linear', shrinking=True)
anova_filter = SelectKBest(f_regression, k=50)
clf = Pipeline([('anova', anova_filter), ('svc', clf)])
maxkfold = np.min(np.bincount(y1)[1:])
result = []
for train, test in cv.StratifiedKFold(y1, maxkfold):
    result.append((y1[test],
                   clf.fit(X1[train], y1[train]).predict(X1[test])))
---
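for reference, a minimal sketch of what the maxkfold line computes. it assumes y1 holds integer labels 1 and 2 with 7 and 10 samples, which is what the row sums of the confusion matrices in this post suggest; the toy array itself is made up.

```python
import numpy as np

# toy stand-in for y1 (assumption): 7 samples of class 1, 10 of class 2
y1 = np.array([1] * 7 + [2] * 10)

counts = np.bincount(y1)        # counts[i] = number of samples with label i
maxkfold = np.min(counts[1:])   # skip label 0, keep the smallest class size

print(counts, maxkfold)
```

with these class sizes the number of folds equals the size of the smaller class, so every test fold holds exactly one sample of that class.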
with k=50 in the above anova filter i get the following confusion matrix:
[[ 0  7]
 [ 0 10]]
with k=51 and greater, i get (i love this, but i don't have much
confidence in it):
[[7 0]
 [1 9]]
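the post does not say how the matrices were tallied, so the following is only a sketch of one plausible way: pool the per-fold (true, predicted) pairs stored in result and count them, with rows as true classes and columns as predicted classes. the helper name pooled_confusion and the toy folds are made up for illustration.

```python
import numpy as np

def pooled_confusion(result):
    """Pool per-fold (true, predicted) label arrays into one confusion matrix.

    Rows index the true class, columns the predicted class.
    """
    y_true = np.concatenate([t for t, _ in result])
    y_pred = np.concatenate([p for _, p in result])
    labels = np.unique(np.concatenate([y_true, y_pred]))
    index = {label: i for i, label in enumerate(labels)}
    cm = np.zeros((len(labels), len(labels)), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[index[t], index[p]] += 1
    return labels, cm

# made-up folds in which every prediction is class 2
toy_result = [
    (np.array([1, 2]), np.array([2, 2])),
    (np.array([1, 2, 2]), np.array([2, 2, 2])),
]
labels, cm = pooled_confusion(toy_result)
print(labels)
print(cm)
```

a matrix whose first column is all zeros, like the k=50 one, means the classifier never predicted the first class in any fold.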
if i throw a Normalizer in the pipeline:
clf = Pipeline([('xfm', Normalizer()), ('anova', anova_filter), ('svc', clf)])
then i get the k=50 pattern above regardless of what i set k to.
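a sketch of what the Normalizer step does with its default settings, as far as the scikit-learn docs describe it: each sample (row) is rescaled to unit L2 norm independently of the others, which is not the same as standardizing each feature (column). the helper name l2_normalize_rows and the toy matrix are made up for illustration.

```python
import numpy as np

def l2_normalize_rows(X):
    """Scale each row (sample) to unit Euclidean length; all-zero rows stay zero."""
    X = np.asarray(X, dtype=float)
    norms = np.sqrt((X ** 2).sum(axis=1, keepdims=True))
    norms[norms == 0.0] = 1.0   # avoid dividing by zero for empty samples
    return X / norms

# toy data (assumption): three samples, two features
X = np.array([[3.0, 4.0],
              [0.5, 0.0],
              [0.0, 0.0]])
print(l2_normalize_rows(X))
```

since the scaling is per sample, placing it before the anova step changes the feature values that the filter scores.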
if i use ShuffleSplit instead of StratifiedKFold, i run into the same issues.
cheers,
satra