On 10/01/2014 04:23 PM, Gavin Hackeling wrote:
Hi all,
I am working on a character recognition problem with the Chars74K
data set. I am reshaping the images to 30x30 pixels, and using the 900
pixels' intensities as features. I am classifying the images using a
SVC with an RBF kernel.
...
pipeline = Pipeline([
('clf', SVC(kernel='rbf'))
])
parameters = {
'clf__gamma': (0.01, 0.03, 0.1, 0.3, 1),
'clf__C': (0.1, 0.3, 1, 3, 10, 30),
}
...
On CrunchBang 11 with scikit-learn 0.15.2, grid search yields the
following results:
Fitting 3 folds for each of 30 candidates, totalling 90 fits
[Parallel(n_jobs=3)]: Done 1 jobs | elapsed: 1.6min
[Parallel(n_jobs=3)]: Done 50 jobs | elapsed: 34.8min
[Parallel(n_jobs=3)]: Done 86 out of 90 | elapsed: 69.4min remaining: 3.2min
[Parallel(n_jobs=3)]: Done 90 out of 90 | elapsed: 71.6min finished
Best score: 0.559
Best parameters set:
clf__C: 3
clf__gamma: 0.03
             precision    recall  f1-score   support
        001       0.00      0.00      0.00         6
        002       1.00      0.20      0.33         5
...
        061       0.00      0.00      0.00         4
        062       0.00      0.00      0.00         4
avg / total       0.56      0.58      0.53       532
On Ubuntu 14.04 and OS X with scikit-learn 0.15.1 and 0.15.2, the same
model performs horribly. The following are the results of the script
for Ubuntu 14.04 with NumPy 1.8.2 and 0.14.0.
avg / total 0.09 0.07 0.02 532
Switching to a polynomial kernel on these platforms yields better
performance, but the RBF kernel still performs best.
It appears that the performance depends on the platform. What might be
the problem here?
Have you fixed the random seed in the GridSearchCV? The dataset seems
much too small for this number of classes, and the cross-validation
results will be very noisy.
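One way to act on this is to pin the cross-validation splits explicitly, since the randomness lives in the splitter rather than in GridSearchCV itself. The sketch below is only illustrative: it uses the bundled digits data as a stand-in for Chars74K, a reduced parameter grid, and the current sklearn.model_selection import paths (in 0.15.x, GridSearchCV lived in sklearn.grid_search and StratifiedKFold in sklearn.cross_validation, with a different constructor). The helper run_search is hypothetical.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Stand-in data (assumption): the bundled digits set, not Chars74K.
X, y = load_digits(return_X_y=True)
X, y = X[:300] / 16.0, y[:300]  # small subset, intensities scaled to [0, 1]

pipeline = Pipeline([("clf", SVC(kernel="rbf"))])
# Reduced grid (assumption) so the sketch runs quickly.
parameters = {"clf__gamma": (0.01, 0.1), "clf__C": (1, 10)}


def run_search(seed):
    """Run the grid search with folds fixed by the given seed."""
    cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=seed)
    search = GridSearchCV(pipeline, parameters, cv=cv)
    search.fit(X, y)
    return search.best_params_, search.best_score_


# Two runs with the same seed use identical folds, so they should agree.
first = run_search(0)
second = run_search(0)
print(first == second)
```

If the scores still differ between machines with the folds pinned like this, the split is probably not the cause.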
If you look at the "good" result, the performance is 0 for every class
shown except class 002, and that one only has 5 samples.
Another source of non-determinism could be using "probability=True"
in the SVC.
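As a rough illustration of that point: with probability=True, SVC fits its probability estimates using an internal cross-validation, and passing random_state makes that step repeatable. The sketch below again uses the digits data as a stand-in for Chars74K, borrows C=3 and gamma=0.03 from the output quoted above, and the helper fit_proba is hypothetical.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.svm import SVC

# Stand-in data (assumption): the bundled digits set, not Chars74K.
X, y = load_digits(return_X_y=True)
X, y = X[:300] / 16.0, y[:300]


def fit_proba(seed):
    """Fit an RBF SVC with probability estimates under a fixed seed."""
    clf = SVC(kernel="rbf", C=3, gamma=0.03,
              probability=True, random_state=seed)
    return clf.fit(X, y).predict_proba(X[:5])


# With the same random_state, the probability estimates should match.
same_seed = np.allclose(fit_proba(0), fit_proba(0))
print(same_seed)
```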
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general