On 18 April 2012 07:35, Andreas Mueller <[email protected]> wrote:
> On 18.04.2012 16:25, Mathieu Blondel wrote:
>> Hello,
>>
>> Recently, I had to train a classifier with 3 hyperparameters (each
>> taking several possible values) and an exhaustive grid search was
>> really expensive. However, in some situations, some hyperparameters
>> are more important than others. For example, for an SVC with RBF
>> kernel, my experience is that choosing gamma (fixing C to 1.0) first,
>> then C (fixing gamma to the previously selected value) works pretty
>> well. Would such a non-exhaustive grid search be useful in scikit-learn?
>>
>> In terms of API, I would add a "search_order" option to GridSearchCV.
>> For example, if param_grid is as follows:
>>
>> param_grid = {"gamma": [0.01, 0.1, 1.0, 10.0],
>>               "C": [1, 0.1, 10, 100, 1000]}
>>
>> then, search_order could be:
>>
>> search_order = ["gamma", "C"]
>>
>> The algorithm would first choose gamma given C=1 (as it is the first
>> in the list), then choose C given the best gamma. More generally,
>> len(search_order) doesn't need to be equal to len(param_grid) if only
>> some parameters must be fixed.
>>
>> If I'm not mistaken, this reduces the number of grid points from
>> \prod_i n_values_i to \sum_i n_values_i.
>>
>> I'm pretty sure this search procedure breaks in situations where
>> hyperparameters are very inter-dependent but I think it could still be
>> useful in some situations. Anyway, we really need more alternatives to
>> exhaustive grid search (cf. James Bergstra's work).
>>
> I would rather go for sampling possible parameter sets. There was some
> preliminary PR on that somewhere...
> Not sure how well-suited this is for rbf-svm though.

I think Mathieu's proposal makes sense for rbf-svm. Random sampling
à la Bergstra would be useful too but probably more interesting for
models with 3+ hyperparams.
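
To make the proposal concrete, here is a rough sketch of the
one-parameter-at-a-time search described above. It is only an
illustration under assumptions: sequential_search and toy_score are
hypothetical names, not part of scikit-learn, and toy_score is a
made-up objective standing in for a cross-validated score. Each
parameter starts at the first value in its list, as in the proposal.

```python
import math


def sequential_search(param_grid, search_order, evaluate):
    """Greedy one-parameter-at-a-time search.

    Hypothetical helper for illustration, not a scikit-learn API.
    Every parameter starts at the first value in its list; parameters are
    then tuned in the order given by search_order, with earlier choices
    kept fixed. Parameters absent from search_order stay at their first value.
    """
    current = {name: values[0] for name, values in param_grid.items()}
    cache = {}  # scores of points already evaluated, to avoid repeated work

    def score(params):
        key = tuple(sorted(params.items()))
        if key not in cache:
            cache[key] = evaluate(params)
        return cache[key]

    for name in search_order:
        # Vary only `name`; all other parameters keep their current values.
        candidates = [dict(current, **{name: v}) for v in param_grid[name]]
        current = max(candidates, key=score)

    return current, score(current), len(cache)


def toy_score(params):
    # Made-up objective (assumption): peaks at gamma=1.0, C=10.
    return -(math.log10(params["gamma"]) ** 2
             + (math.log10(params["C"]) - 1.0) ** 2)


param_grid = {"gamma": [0.01, 0.1, 1.0, 10.0],
              "C": [1, 0.1, 10, 100, 1000]}

best, best_score, n_evals = sequential_search(
    param_grid, ["gamma", "C"], toy_score)
print(best, n_evals)
```

On this grid the sketch scores 8 distinct points (4 for gamma, then
5 for C, one of which was already scored), versus 4 * 5 = 20 for the
exhaustive grid. In practice evaluate would wrap a cross-validation
run, and the result is only as good as the assumption that the
parameters are not strongly inter-dependent.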

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general