Dear scikit-learn team,
In light of Christoph Angermüller's proposal to enhance
scikit-learn with Bayesian optimization
(http://sourceforge.net/p/scikit-learn/mailman/message/33630274/) as a
GSoC project, it may also be worth reconsidering the integration
of a hyperparameter concept into scikit-learn.
Our group built a framework called ParamSklearn
(https://bitbucket.org/mfeurer/paramsklearn/overview), which provides
hyperparameter definitions for a subset of classifiers, regressors and
preprocessors in scikit-learn. The result is similar to what
James Bergstra did in hpsklearn
(https://github.com/hyperopt/hyperopt-sklearn) and to a post from 2010
(http://sourceforge.net/p/scikit-learn/mailman/scikit-learn-general/thread/aanlktilvznvavqr-sbiixcguwyuf6jyq_ijvytdx7...@mail.gmail.com/?page=0).
In the end you get a configuration space which can then be read by a
Sequential Model-based Optimization package. For example, we used this
module for our AutoSklearn entry in the first automated machine learning
competition: https://sites.google.com/a/chalearn.org/automl/
Optimizing hyperparameters is a challenge in itself, but defining relevant
ranges is also a difficult task for non-experts. Thus, it would be nice
to find a way to integrate the hyperparameter definitions into
scikit-learn (see bottom of this e-mail for a suggestion) such that they
can be used either by the not-yet-existing GPSearchCV, the already
existing RandomizedSearchCV or the GridSearchCV, but also by external
tools like our ParamSklearn. The hyperparameter definitions would leave
a user with only two mandatory choices: number of evaluations/runtime
and the estimator to use.
What do you think?
Best regards,
Matthias Feurer
Currently, we define the hyperparameters with a package called
HPOlibConfigSpace (https://github.com/automl/HPOlibConfigSpace). For the
SVC it looks like this:
C = UniformFloatHyperparameter("C", 0.03125, 32768, log=True, default=1.0)
kernel = CategoricalHyperparameter(name="kernel",
    choices=["rbf", "poly", "sigmoid"], default="rbf")
degree = UniformIntegerHyperparameter("degree", 1, 5, default=3)
gamma = UniformFloatHyperparameter("gamma", 3.0517578125e-05, 8,
    log=True, default=0.1)
coef0 = UniformFloatHyperparameter("coef0", -1, 1, default=0)
shrinking = CategoricalHyperparameter("shrinking", ["True", "False"],
    default="True")
tol = UniformFloatHyperparameter("tol", 1e-5, 1e-1, default=1e-4,
    log=True)
class_weight = CategoricalHyperparameter("class_weight",
    ["None", "auto"], default="None")
max_iter = UnParametrizedHyperparameter("max_iter", -1)
cs = ConfigurationSpace()
cs.add_hyperparameter(C)
cs.add_hyperparameter(kernel)
cs.add_hyperparameter(degree)
cs.add_hyperparameter(gamma)
cs.add_hyperparameter(coef0)
cs.add_hyperparameter(shrinking)
cs.add_hyperparameter(tol)
cs.add_hyperparameter(class_weight)
cs.add_hyperparameter(max_iter)
degree_depends_on_poly = EqualsCondition(degree, kernel, "poly")
coef0_condition = InCondition(coef0, kernel, ["poly", "sigmoid"])
cs.add_condition(degree_depends_on_poly)
cs.add_condition(coef0_condition)
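To illustrate what the two conditions mean for a sampler, here is a rough
sketch in plain Python. It does not use HPOlibConfigSpace; the helper names
(log_uniform, sample_svc_configuration) are made up for this example, and it
uses real Python values (True, None) instead of the strings above. The point
is only that degree and coef0 appear in a configuration when the kernel makes
them active, and are absent otherwise.

```python
import math
import random


def log_uniform(rng, lower, upper):
    """Draw a float whose logarithm is uniform on [log(lower), log(upper)]."""
    return math.exp(rng.uniform(math.log(lower), math.log(upper)))


def sample_svc_configuration(rng):
    """Sample one SVC configuration, honouring the two conditions.

    Hypothetical stand-in for sampling from the ConfigurationSpace above.
    """
    config = {
        "C": log_uniform(rng, 0.03125, 32768),
        "kernel": rng.choice(["rbf", "poly", "sigmoid"]),
        "gamma": log_uniform(rng, 3.0517578125e-05, 8),
        "shrinking": rng.choice([True, False]),
        "tol": log_uniform(rng, 1e-5, 1e-1),
        "class_weight": rng.choice([None, "auto"]),
        "max_iter": -1,  # fixed, like the UnParametrizedHyperparameter
    }
    # Mirrors EqualsCondition(degree, kernel, "poly").
    if config["kernel"] == "poly":
        config["degree"] = rng.randint(1, 5)
    # Mirrors InCondition(coef0, kernel, ["poly", "sigmoid"]).
    if config["kernel"] in ("poly", "sigmoid"):
        config["coef0"] = rng.uniform(-1, 1)
    return config


rng = random.Random(0)
samples = [sample_svc_configuration(rng) for _ in range(200)]
```

An optimizer that ignores the conditions would waste evaluations on, say,
different degree values for an rbf kernel, where they have no effect.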
The code is more verbose than it has to be, but we are working on this.
The ConfigurationSpace object could then be exposed through a
@staticmethod on the estimator and used as a parameter description
object inside *SearchCV. We can provide a stripped-down version of
HPOlibConfigSpace for integration in sklearn.externals, as well as the
hyperparameter definitions we have so far.
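As a rough idea of how the @staticmethod access could look, here is a sketch
in plain Python. Everything in it is hypothetical: the class
SVCWithSearchSpace, the method name get_hyperparameter_search_space, the
plain-dict stand-in for a ConfigurationSpace, and the helper to_param_grid
are invented for this example and are not part of scikit-learn or
HPOlibConfigSpace. The values are a small discretized subset of the ranges
above, and the helper assumes all conditions depend on a single parent.

```python
class SVCWithSearchSpace:
    """Hypothetical estimator stub; only the search-space accessor is shown."""

    @staticmethod
    def get_hyperparameter_search_space():
        # Plain-dict stand-in for a ConfigurationSpace object.
        return {
            "values": {
                "C": [0.03125, 1.0, 32768],
                "kernel": ["rbf", "poly", "sigmoid"],
                "degree": [1, 2, 3, 4, 5],
                "coef0": [-1, 0, 1],
            },
            # child -> (parent, parent values for which the child is active)
            "conditions": {
                "degree": ("kernel", ["poly"]),
                "coef0": ("kernel", ["poly", "sigmoid"]),
            },
        }


def to_param_grid(space, parent="kernel"):
    """Expand the space into one dict per parent value.

    Assumes every condition depends on the same parent hyperparameter.
    """
    values, conditions = space["values"], space["conditions"]
    grid = []
    for parent_value in values[parent]:
        entry = {parent: [parent_value]}
        for name, candidates in values.items():
            if name == parent:
                continue
            condition = conditions.get(name)
            # Keep unconditional parameters and those active for this parent.
            if condition is None or parent_value in condition[1]:
                entry[name] = list(candidates)
        grid.append(entry)
    return grid


param_grid = to_param_grid(SVCWithSearchSpace.get_hyperparameter_search_space())
```

A list of dicts like param_grid is the form GridSearchCV already accepts, so
conditional parameters would not need any change on the grid search side.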