On Wed, Mar 21, 2012 at 6:46 AM, Olivier Grisel <[email protected]> wrote:
> On Wed, Mar 21, 2012 at 11:14 AM, Mathieu Blondel <[email protected]> wrote:
>> On Mon, Mar 19, 2012 at 1:22 AM, Andreas <[email protected]> wrote:
>>
>>> Are there any other options?
>>
>> Another solution is to perform cross-validation using non-scaled C
>> values, select the best one and scale it before refitting on the
>> entire dataset (to take into account that the entire dataset is
>> bigger than a train split).
>>
>> Injecting estimator-specific code into GridSearchCV would be dirty,
>> so an SVCCV class could be added. Note that, in my opinion, such a
>> class should be added anyway: currently the grid search throws away
>> the kernel cache even though it could be reused across folds (unless
>> the parameter being searched is a kernel parameter). Reusing the
>> kernel cache makes it hard to parallelize the grid search, but I
>> wouldn't be surprised if a sequential approach with a shared kernel
>> cache were faster than a parallel approach with separate kernel
>> caches.
>
> I am pretty sure that warm restarting the support vector active set
> would help too if we are to compute a regularization path.
> Unfortunately, I don't think the public C++ API of libsvm makes that
> easy / possible...
>
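If I follow Mathieu's suggestion correctly, it amounts to something like the
untested sketch below. The iris data, the C grid and the 5-fold split are
just illustrative, and I'm assuming the quantity to keep constant between a
training split and the full dataset is C * n_samples (i.e. effective
regularization ~ 1 / (C * n_samples)), which is how I read the scaling
argument:

    from sklearn.datasets import load_iris
    from sklearn.svm import SVC
    from sklearn.grid_search import GridSearchCV

    # Toy data standing in for the real problem.
    iris = load_iris()
    X, y = iris.data, iris.target
    n_total = X.shape[0]

    # 1) Cross-validate over raw (non-scaled) C values.
    Cs = [10.0 ** k for k in range(-3, 4)]
    grid = GridSearchCV(SVC(kernel='rbf'), {'C': Cs}, cv=5)
    grid.fit(X, y)
    best_C = grid.best_estimator_.C

    # 2) Rescale the selected C before refitting on the whole dataset,
    #    keeping C * n_samples constant (a training split holds roughly
    #    4/5 of the samples with 5 folds).
    n_train = int(round(n_total * 4.0 / 5.0))
    C_full = best_C * n_train / float(n_total)

    clf = SVC(kernel='rbf', C=C_full).fit(X, y)

With 5 folds the correction factor is only ~4/5, so it may matter less than
choosing a sensible grid in the first place, but at least it keeps the
selected C consistent with the final refit.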
Also, isn't the feature normalization supposed to be done on a
fold-by-fold basis? If you're doing that, you have a different kernel
matrix in every fold anyway.

- James
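P.S. By "fold-by-fold" I mean putting the scaler inside the pipeline so it
gets re-fit on every training split. An untested sketch (class names as in
current scikit-learn, toy data and grid purely illustrative):

    from sklearn.datasets import load_iris
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC
    from sklearn.grid_search import GridSearchCV

    iris = load_iris()
    X, y = iris.data, iris.target

    # The scaler sits inside the pipeline, so GridSearchCV re-fits it on
    # every training fold; each fold is standardized with its own
    # statistics and therefore produces a different Gram matrix.
    pipe = Pipeline([('scale', StandardScaler()),
                     ('svc', SVC(kernel='rbf'))])
    grid = GridSearchCV(pipe, {'svc__C': [0.1, 1.0, 10.0, 100.0]}, cv=5)
    grid.fit(X, y)

Since each fold ends up with its own scaling statistics, the Gram matrices
differ across folds, so a shared kernel cache would only help when the
preprocessing is fixed outside the CV loop.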
