Re: [Scikit-learn-general] RBF SVC performance depends on platform

Andy Fri, 03 Oct 2014 09:14:46 -0700

On 10/01/2014 04:23 PM, Gavin Hackeling wrote:

Hi all,
I am working on a character recognition problem with the Chars74K data set. I am reshaping the images to 30x30 pixels, and using the 900 pixels' intensities as features. I am classifying the images using an SVC with an RBF kernel.
...
    pipeline = Pipeline([
        ('clf', SVC(kernel='rbf'))
    ])
    parameters = {
        'clf__gamma': (0.01, 0.03, 0.1, 0.3, 1),
        'clf__C': (0.1, 0.3, 1, 3, 10, 30),
    }
...
On CrunchBang 11 with scikit-learn 0.15.2, grid search yields the following results:
Fitting 3 folds for each of 30 candidates, totalling 90 fits
[Parallel(n_jobs=3)]: Done   1 jobs       | elapsed:  1.6min
[Parallel(n_jobs=3)]: Done  50 jobs       | elapsed: 34.8min
[Parallel(n_jobs=3)]: Done  86 out of  90 | elapsed: 69.4min remaining:  3.2min
[Parallel(n_jobs=3)]: Done  90 out of  90 | elapsed: 71.6min finished
Best score: 0.559
Best parameters set:
clf__C: 3
clf__gamma: 0.03
             precision    recall  f1-score   support

        001       0.00      0.00      0.00         6
        002       1.00      0.20      0.33         5
        ...
        061       0.00      0.00      0.00         4
        062       0.00      0.00      0.00         4

avg / total       0.56      0.58      0.53       532
On Ubuntu 14.04 and OS X with scikit-learn 0.15.1 and 0.15.2, the same model performs horribly. The following are the results of the script for Ubuntu 14.04 with NumPy 1.8.2 and 0.14.0.
avg / total       0.09      0.07      0.02       532
Switching to a polynomial kernel on these platforms yields better performance, but the RBF kernel still performs best.
It appears that the performance depends on the platform. What might be the problem here?

Have you fixed the random seed in the GridSearchCV? The dataset seems much too small for this number of classes, and the results of the cross-validation will be very noisy. If you look at the "good" result, the performance is 0 for all classes that are visible except class number 002, and that one only has 5 samples. Another source of non-determinism could be if you use "probability=True" in the SVC.
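
One way to check whether the difference comes from the seed rather than from the platform is sketched below. This is only an illustration under assumptions: it uses the current `sklearn.model_selection` API (the 0.15-era modules were named differently), a small slice of the bundled digits data as a stand-in for Chars74K, a reduced parameter grid, and a hypothetical helper called `run_search`. The seed is passed to the fold splitter and to the SVC, where it only matters if `probability=True` is set.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC


def run_search(seed, n_samples=300):
    """Grid-search an RBF SVC with the fold assignment tied to `seed`.

    `run_search` is a hypothetical helper for this sketch, not part of
    scikit-learn or of the original script.
    """
    # Stand-in data (assumption): the first `n_samples` digits images,
    # with pixel intensities scaled from [0, 16] to [0, 1].
    X, y = load_digits(return_X_y=True)
    X, y = X[:n_samples] / 16.0, y[:n_samples]

    # random_state on SVC only has an effect when probability=True.
    pipeline = Pipeline([("clf", SVC(kernel="rbf", random_state=seed))])
    parameters = {
        "clf__gamma": (0.01, 0.03, 0.1),
        "clf__C": (1, 3, 10),
    }
    # Shuffled, stratified folds; the seed fixes which sample lands in which fold.
    cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=seed)
    search = GridSearchCV(pipeline, parameters, cv=cv)
    search.fit(X, y)
    return search.best_params_, search.best_score_


# Same seed twice: the two runs should agree exactly on one machine.
print(run_search(seed=0) == run_search(seed=0))

# Different seeds: the spread of the best score shows how noisy the estimate is.
scores = [run_search(seed=s)[1] for s in range(5)]
print("min %.3f, max %.3f" % (min(scores), max(scores)))
```

If the scores on two platforms still differ with identical seeds and identical input arrays, the seed is probably not the cause and it is worth comparing the loaded feature matrices directly.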

