Re: [Scikit-learn-general] Why does polynomial SVC perform so poorly on document classification?

Olivier Grisel Mon, 13 Feb 2012 12:53:52 -0800

2012/2/13 Lars Buitinck <[email protected]>:
> Hi all,
>
> After reading some papers on (approximate) polynomial kernels for NLP
> applications, I got curious and decided to do some quick experiments.
> I modified the 20 newsgroups example to benchmark vanilla SVC instead
> of LinearSVC with linear, quadratic and cubic kernels. I was quite
> surprised at the results.
>
> For reference, LinearSVC(C=1000, loss=l2, penalty=l2) obtains an
> F1-score of 0.896 on the default set of four document classes.
>
> I replaced this with
>
>    params = {'C': [.01, .1, 1, 10, 100, 1000]}
>    GridSearchCV(SVC(kernel='linear'), params, score_func=metrics.f1_score)


I don't know about the polynomial kernel part, but since C is scaled
according to the number of samples, C=1e4 or more is required for text
classification.
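
To make the arithmetic concrete, here is a rough sketch. It assumes the
scaling amounts to dividing C by the number of training samples, and it
uses a made-up round figure of 2000 training documents; neither number
is taken from the benchmark itself. The helper name effective_c is
hypothetical, and the two extra grid values are only a suggestion.

```python
# Sketch only: ASSUMES the library divides C by the number of training
# samples before handing it to the solver. Check this against the
# version you are running.
def effective_c(c, n_samples):
    """Penalty the solver would see under the division assumption."""
    return c / float(n_samples)


n_samples = 2000  # assumed round figure, not measured from the benchmark

# Grid from the quoted message, plus two larger values as a suggestion.
original_grid = [.01, .1, 1, 10, 100, 1000]
extended_grid = original_grid + [1e4, 1e5]

for c in extended_grid:
    print("C=%g -> effective C=%g" % (c, effective_c(c, n_samples)))
```

Under these assumptions every value in the original grid ends up with an
effective C below 1, so the search may never reach the weakly
regularized regime where the linear model does well on text.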

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

