On Wed, Mar 21, 2012 at 6:46 AM, Olivier Grisel
<[email protected]> wrote:
> On 21 March 2012 at 11:14, Mathieu Blondel <[email protected]> wrote:
>> On Mon, Mar 19, 2012 at 1:22 AM, Andreas <[email protected]> wrote:
>>
>>> Are there any other options?
>>
>> Another solution is to perform cross-validation using non-scaled C
>> values, select the best one, and scale it before refitting on the
>> entire dataset (to take into account that the entire dataset is larger
>> than a training split).
>> Injecting estimator-specific code into GridSearchCV would be dirty, so
>> an SVCCV class could be added. Note that, in my opinion, such a class
>> should be added anyway: currently the grid search throws away the
>> kernel cache even though it could be reused across folds (unless the
>> parameter being searched is a kernel parameter). Reusing the kernel
>> cache makes it hard to parallelize the grid search, but I wouldn't be
>> surprised if a sequential approach with a shared kernel cache were
>> faster than a parallel approach with separate kernel caches.
>
> I am pretty sure that warm-restarting the support vector active set
> would also help if we are computing a regularization path.
> Unfortunately, I don't think the public C++ API of libsvm makes that
> easy / possible...
>
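
To make the C-rescaling idea above concrete, here is a rough sketch of one
way to do it by hand -- none of this is built into GridSearchCV, and the
rescaling convention (keeping C * n_samples roughly constant, so the C
selected on a training split is shrunk before the final refit) is just one
possible choice; module paths are those of recent scikit-learn releases:

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import GridSearchCV, StratifiedKFold
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=500, random_state=0)

    n_folds = 5
    # refit=False: we do the final refit ourselves with the rescaled C.
    grid = GridSearchCV(SVC(kernel="rbf"),
                        {"C": np.logspace(-2, 3, 6)},
                        cv=StratifiedKFold(n_splits=n_folds),
                        refit=False)
    grid.fit(X, y)
    best_C = grid.best_params_["C"]

    # Hypothetical rescaling: keep C * n_samples roughly constant, so the
    # C found on a training split of ~(k-1)/k * n samples is scaled down
    # before refitting on all n samples.  The "right" invariant depends on
    # how the SVM objective is normalized.
    n = X.shape[0]
    n_train = n * (n_folds - 1) / n_folds
    C_full = best_C * n_train / n

    final_clf = SVC(kernel="rbf", C=C_full).fit(X, y)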

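On the kernel cache point: short of patching libsvm, the closest workaround
I can think of is to precompute the Gram matrix once and search only over C
with kernel='precomputed', so at least the kernel computation (though not
libsvm's internal cache) is shared across folds and parameter settings. A
rough sketch, assuming a fixed RBF gamma and a scikit-learn recent enough
to slice precomputed kernels in the CV splitting:

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.metrics.pairwise import rbf_kernel
    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=300, random_state=0)

    # Computed once; the CV machinery slices the relevant rows/columns of
    # this square matrix for each fold instead of recomputing the kernel.
    K = rbf_kernel(X, X, gamma=0.1)

    grid = GridSearchCV(SVC(kernel="precomputed"),
                        {"C": np.logspace(-2, 3, 6)}, cv=5)
    grid.fit(K, y)
    print(grid.best_params_)
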
Also, isn't the feature normalization supposed to be done on a
fold-by-fold basis? If you're doing that, you have a different kernel
matrix in every fold anyway.
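
If it helps, by fold-by-fold normalization I just mean the usual Pipeline
construction, where the scaler statistics are re-estimated on each training
fold (again with recent module paths):

    from sklearn.datasets import make_classification
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=300, random_state=0)

    pipe = Pipeline([("scale", StandardScaler()),
                     ("svc", SVC(kernel="rbf"))])

    # The scaler is re-fit on each training fold, so the (implicit) kernel
    # matrix seen by the SVC differs slightly from fold to fold.
    grid = GridSearchCV(pipe, {"svc__C": [0.1, 1.0, 10.0]}, cv=5)
    grid.fit(X, y)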

- James
