Did you fix the random seeds across implementations as well? Differences in seeds or generators might explain this.
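
On the scikit-learn side, a minimal sketch of pinning the seed might look like the one below. It assumes the relevant randomness is the internal cross-validation that SVC runs when probability=True, which the `random_state` argument controls. The `make_classification` data and the `fit_and_predict` helper are hypothetical stand-ins, not the simulation from the original post, and this does not make Matlab's generator match.

```python
import numpy as np
from sklearn import svm
from sklearn.datasets import make_classification

# Hypothetical stand-in data; the original simulated data is not available.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)
X_train, y_train = X[:150], y[:150]
X_test = X[150:]


def fit_and_predict(seed):
    """Fit a linear SVC with probability estimates using a fixed seed."""
    # random_state seeds the shuffling used for the probability calibration.
    clf = svm.SVC(kernel="linear", C=1, probability=True, random_state=seed)
    clf.fit(X_train, y_train)
    return clf.predict(X_test), clf.predict_proba(X_test)


labels_a, proba_a = fit_and_predict(seed=0)
labels_b, proba_b = fit_and_predict(seed=0)

# Fraction of test items on which the two runs give the same label.
agreement = np.mean(labels_a == labels_b)
print("agreement between runs:", agreement)
print("max probability difference:", np.abs(proba_a - proba_b).max())
```

If two runs with the same seed agree exactly but the Matlab results still differ, the seed within scikit-learn is probably not the whole story, and the generators themselves would be the next thing to compare.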

Thanks,
Michael J. Bommarito II, CEO
Bommarito Consulting, LLC
*Web:* http://www.bommaritollc.com
*Mobile:* +1 (646) 450-3387

On Wed, Jun 22, 2016 at 1:15 PM, Taylor, Johnmark <johnmarktay...@g.harvard.edu> wrote:

> Hello,
>
> I am moving much of my neuroimaging coding over to Python from Matlab and
> so I am switching from using libsvm in Matlab to using Scikit-learn SVM in
> Python. Just to make sure I am not changing anything substantive about my
> analyses, I am experimenting with the two implementations and trying to see
> whether I can get them to yield identical results.
>
> In Python I am using:
>
> clf = svm.SVC(kernel='linear',C=1,probability=True)
>
> In Matlab (libsvm) I am using:
>
> clf = libsvmtrain(svm_training_labels,svm_training_vectors,['-t 0 -b 1 -c 1'])
>
> When I run the SVM in these two different ways on simulated data, I
> get subtly different results, even though I have fixed all of the
> parameters of the SVMs to be the same using input arguments (linear
> classifier, C=1, use probability estimates), and even though all the other
> default parameters seem to be the same across these functions (tolerance =
> .001, both using shrinking heuristics by default).
>
> To give more details regarding the simulations:
>
> One simulation I ran was designed to be absurdly difficult--it yielded 40%
> accuracy for Matlab libsvm, and 44% accuracy for scikit-learn svm (binary
> classification, chance = 50%). In this simulation, the two SVMs agreed in
> their predictions only 18% of the time (in other words, they were both not
> only guessing below chance, but they nearly always gave opposite guesses
> compared to each other).
>
> The other simulation was easier, yielding 68% accuracy for Matlab libsvm,
> and 67% accuracy for scikit-learn SVM. In this simulation, the two SVMs
> agreed in their predictions 97% of the time. So even though they often got
> it wrong, they tended to make the same wrong guesses.
>
> Any idea of what could possibly be leading to differences in the results?
> My understanding is that SKL uses libsvm under the hood, so it's been
> confusing why the decoders are behaving differently. Both analyses are
> being run on the same computer (Linux OS).
>
> Thank you very much,
>
> JohnMark Taylor
>
> PhD Student, Harvard Vision Sciences Lab
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn@python.org
> https://mail.python.org/mailman/listinfo/scikit-learn