On 18 April 2012 07:35, Andreas Mueller <[email protected]> wrote:
> On 18.04.2012 16:25, Mathieu Blondel wrote:
>> Hello,
>>
>> Recently, I had to learn a classifier with 3 hyperparameters (each
>> taking several possible values) and an exhaustive grid search was
>> really expensive. However, in some situations, some hyperparameters
>> are more important than others. For example, for an SVC with an RBF
>> kernel, my experience is that choosing gamma first (fixing C to 1.0),
>> then C (fixing gamma to the previously selected value) works pretty
>> well. Would such a non-exhaustive grid search be useful in scikit-learn?
>>
>> In terms of API, I would add a "search_order" option to GridSearchCV.
>> For example, if param_grid is as follows:
>>
>> param_grid = {"gamma": [0.01, 0.1, 1.0, 10.0],
>>               "C": [1, 0.1, 10, 100, 1000]}
>>
>> then search_order could be:
>>
>> search_order = ["gamma", "C"]
>>
>> The algorithm would first choose gamma given C=1.0 (as it is the first
>> in the list), then choose C given the best gamma. More generally,
>> len(search_order) doesn't need to be equal to len(param_grid) if only
>> some parameters must be fixed.
>>
>> If I'm not mistaken, this reduces the number of grid points from
>> \prod_i n_values_i to \sum_i n_values_i.
>>
>> I'm pretty sure this search procedure breaks in situations where
>> hyperparameters are very inter-dependent but I think it could still be
>> useful in some situations. Anyway, we really need more alternatives to
>> exhaustive grid search (cf. James Bergstra's work).
>>
> I would rather go for sampling possible parameter sets. There was some
> preliminary PR on that somewhere...
> Not sure how well-suited this is for rbf-svm though.
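
For concreteness, the one-parameter-at-a-time search quoted above could be
sketched roughly as below. This is only an illustration under assumptions,
not an existing GridSearchCV feature: the function name
sequential_grid_search, the evaluate callable and the toy_score objective
are all hypothetical. In practice evaluate would return something like a
mean cross-validation score for the given parameters.

```python
import math


def sequential_grid_search(param_grid, search_order, evaluate):
    """Tune one parameter at a time instead of the full Cartesian grid.

    param_grid   -- dict mapping parameter name to a list of candidate values
    search_order -- names of the parameters to tune, in order; parameters
                    not listed stay fixed at the first value in their list
    evaluate     -- hypothetical callable: dict of params -> score
                    (higher is better), e.g. a mean cross-validation score
    Returns (best_params, best_score, n_evaluations).
    """
    # Every parameter starts at the first value of its list.
    current = {name: values[0] for name, values in param_grid.items()}
    best_score = None
    n_evaluations = 0
    for name in search_order:
        best_value, best_for_name = current[name], None
        for value in param_grid[name]:
            candidate = dict(current)
            candidate[name] = value
            score = evaluate(candidate)
            n_evaluations += 1
            if best_for_name is None or score > best_for_name:
                best_for_name, best_value = score, value
        # Freeze the winner before moving on to the next parameter.
        current[name] = best_value
        best_score = best_for_name
    return current, best_score, n_evaluations


def toy_score(params):
    # Made-up separable objective that peaks at gamma=1.0, C=10.
    # It stands in for a real cross-validation score.
    return -(math.log10(params["gamma"]) ** 2
             + (math.log10(params["C"]) - 1.0) ** 2)


param_grid = {"gamma": [0.01, 0.1, 1.0, 10.0],
              "C": [1, 0.1, 10, 100, 1000]}
search_order = ["gamma", "C"]

best_params, best_score, n_evaluations = sequential_grid_search(
    param_grid, search_order, toy_score)
print(best_params, n_evaluations)
```

On this grid the sketch makes 4 + 5 = 9 evaluations instead of the
4 * 5 = 20 of an exhaustive search. The toy objective is separable, so the
sequential search happens to find the true optimum here; with strongly
interacting parameters it may not.
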
I think Mathieu's proposal makes sense for rbf-svm. Random sampling à la
Bergstra would be useful too, but is probably more interesting for models
with 3+ hyperparameters.

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
