Hi JohnMark,

SVMs, by design, are quite sensitive to the addition of single data points, but only if those points happen to lie near the margin. I wrote about some of those details here: https://jakevdp.github.io/PythonDataScienceHandbook/05.07-support-vector-machines.html
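Here's a minimal sketch of that effect on synthetic 2-D data (the data and point locations are purely illustrative, not your fMRI setup): adding a point far from the boundary leaves the fitted hyperplane essentially unchanged, while adding a point inside the margin shifts it.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)

# Two roughly separated 2-D classes, 20 points each
X = np.vstack([rng.randn(20, 2) - [2, 2], rng.randn(20, 2) + [2, 2]])
y = np.array([0] * 20 + [1] * 20)

clf = SVC(kernel='linear', C=1).fit(X, y)
w0 = clf.coef_[0].copy()

# Add one point far from the margin: it gets zero dual weight,
# so the optimal hyperplane is unchanged.
X_far = np.vstack([X, [[-6.0, -6.0]]])
clf_far = SVC(kernel='linear', C=1).fit(X_far, np.append(y, 0))

# Add one point inside the margin: it becomes a support vector
# and pulls the hyperplane toward it.
X_near = np.vstack([X, [[0.5, -0.5]]])
clf_near = SVC(kernel='linear', C=1).fit(X_near, np.append(y, 0))

print("shift from far point: ", np.linalg.norm(w0 - clf_far.coef_[0]))
print("shift from near point:", np.linalg.norm(w0 - clf_near.coef_[0]))
```

The "shift from near point" is much larger: only points on or inside the margin participate in the SVM solution, so a single well-placed point can move the boundary while dozens of interior points are irrelevant.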
Hope that helps,
Jake

Jake VanderPlas
Senior Data Science Fellow
Director of Open Software
University of Washington eScience Institute

On Tue, Dec 19, 2017 at 1:27 PM, Taylor, Johnmark <johnmarktay...@g.harvard.edu> wrote:

> Hello,
>
> I am a researcher in fMRI and am using SVMs to analyze brain data. I am
> doing decoding between two classes, each of which has 24 exemplars. I am
> comparing two different methods of cross-validation for my data: in one, I
> am training on 23 exemplars from each class and testing on the remaining
> exemplar from each class; in the other, I am training on 22 exemplars from
> each class and testing on the remaining two from each class (in case it
> matters, the data is structured into different neuroimaging "runs", with
> each "run" containing several "blocks"; the first cross-validation method
> leaves out one block at a time, the second leaves out one run at a time).
>
> Now, I would have thought that these two CV methods would give very
> similar results, since the vast majority of the training data is the same;
> the only difference is the two additional points. However, they are
> yielding very different results: training on 23 per class yields 60%
> decoding accuracy (averaged across several subjects, and statistically
> significantly greater than chance), while training on 22 per class yields
> chance (50%) decoding. Leaving aside the particulars of fMRI: is it
> unusual for single points (amounting to less than 5% of the data) to have
> such a big influence on SVM decoding? I am using a cost parameter of C=1.
> I must say it is counterintuitive to me that just a couple of points out
> of two dozen could make such a big difference.
>
> Thank you very much, and cheers,
>
> JohnMark
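For reference, the two schemes described in the quoted message (leave-one-block-out vs. leave-one-run-out) map directly onto scikit-learn's `LeaveOneGroupOut`, switching only the `groups` array. This sketch uses synthetic data with an assumed layout of 24 blocks in 12 runs of 2 blocks each; the shapes and labels are illustrative, not the original analysis:

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score
from sklearn.svm import SVC

rng = np.random.RandomState(0)

# Stand-in data: 24 exemplars per class, 10 features each
X = rng.randn(48, 10)
y = np.repeat([0, 1], 24)

# Each block holds one exemplar per class; each run holds two blocks
blocks = np.tile(np.arange(24), 2)  # 24 blocks -> train 23/class, test 1/class
runs = blocks // 2                  # 12 runs  -> train 22/class, test 2/class

logo = LeaveOneGroupOut()
clf = SVC(kernel='linear', C=1)

acc_block = cross_val_score(clf, X, y, groups=blocks, cv=logo).mean()
acc_run = cross_val_score(clf, X, y, groups=runs, cv=logo).mean()
print("leave-one-block-out accuracy:", acc_block)
print("leave-one-run-out accuracy:  ", acc_run)
```

On pure noise like this, both schemes should hover around chance; a large, systematic gap between them on real data usually points to something beyond the two extra training points, e.g. run-level dependence between training and test samples.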
_______________________________________________ scikit-learn mailing list scikit-learn@python.org https://mail.python.org/mailman/listinfo/scikit-learn