Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-27 Thread Christof Angermueller
I submitted my final proposal to melange. Thanks everybody for your suggestions! Christof On 2015-03-26 22:51, Andy wrote: I think you should focus on first creating a prototype without ParamSklearn. On 03/26/2015 06:19 PM, Christof Angermueller wrote: Hi Matthias, using HPOlib to benchmark

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-26 Thread Andreas Mueller
Hi Matthias. As far as I know, the main goal for TPE was to support tree-structured parameter spaces. I am not sure we want to go there yet because of the more complex API. On non-tree structured spaces, I think TPE performed worse than SMAC and GP. With regard to your code: There might be
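For context, "tree-structured" here means parameters that only exist under certain branches of the space. A sketch in hyperopt's notation (the library built around TPE; parameter names and ranges are arbitrary):

    from hyperopt import hp

    # 'gamma' only exists on the 'rbf' branch, 'degree' only on 'poly':
    # this conditional structure is what a flat dict of ranges cannot express.
    space = hp.choice('kernel', [
        {'kernel': 'linear'},
        {'kernel': 'rbf',  'gamma':  hp.loguniform('gamma', -8, 0)},
        {'kernel': 'poly', 'degree': hp.choice('degree', [2, 3, 4])},
    ])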

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-26 Thread Christof Angermueller
Hi Andy and others, I revised my proposal (https://docs.google.com/document/d/1bAWdiu6hZ6-FhSOlhgH-7x3weTluxRfouw9op9bHBxs/edit?usp=sharing) and submitted it to melange. Can you have a look and check whether any essential (formal) things are missing? I will submit the final version tomorrow. Cheers,

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-26 Thread Andreas Mueller
I think the class that you introduce should really be geared towards scikit-learn estimators. But there could be a lower level function that just optimizes a black-box function. That is probably desirable from a modularity standpoint and for testing anyhow. On 03/26/2015 05:07 PM, Christof
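A hedged sketch of that split: a low-level routine that only sees a black-box function, which an estimator-oriented class could call after wrapping cross-validated error. All names here are hypothetical, and plain random sampling stands in for the surrogate-driven proposal step:

    import numpy as np

    def minimize_black_box(f, bounds, n_iter=50, random_state=0):
        # f maps a 1-d parameter array to a scalar loss; bounds is a
        # list of (low, high) pairs.  Testing this function needs no
        # estimator at all, which is the modularity argument above.
        rng = np.random.RandomState(random_state)
        lo, hi = np.asarray(bounds, dtype=float).T
        best_x, best_y = None, np.inf
        for _ in range(n_iter):
            x = rng.uniform(lo, hi)
            y = f(x)
            if y < best_y:
                best_x, best_y = x, y
        return best_x, best_y

    # e.g. minimize_black_box(lambda x: ((x - 1) ** 2).sum(), [(-5, 5)] * 2)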

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-26 Thread Christof Angermueller
GridSearchCV and RandomizedSearchCV inherit from BaseSearchCV and require an estimator object with fit() and predict() as the first constructor argument. Hence, the estimator must follow the sklearn convention with fit() and predict(). Alternatively, the estimator might also be implemented as a black-box
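For reference, the convention in question (at the time GridSearchCV lived in sklearn.grid_search; it later moved to sklearn.model_selection). A hypothetical GPSearchCV would presumably mirror this signature:

    from sklearn.datasets import load_iris
    from sklearn.grid_search import GridSearchCV
    from sklearn.svm import SVC

    iris = load_iris()
    # The estimator with fit()/predict() is the first constructor argument:
    search = GridSearchCV(SVC(), param_grid={'C': [0.1, 1, 10],
                                             'gamma': [0.01, 0.1, 1]}, cv=3)
    search.fit(iris.data, iris.target)
    print(search.best_params_, search.best_score_)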

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-26 Thread Christof Angermueller
Hi Matthias, using HPOlib to benchmark GPSearchCV on the same datasets that were used to benchmark spearmint, TPE, and SMAC is a good idea, and I will include it in my proposal. However, I plan to primarily compare GPSearchCV with GridSearchCV, RandomizedSearchCV, as well as spearmint as

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-26 Thread Andy
I think you should focus on first creating a prototype without ParamSklearn. On 03/26/2015 06:19 PM, Christof Angermueller wrote: Hi Matthias, using HPOlib to benchmark GPSearchCV on the same datasets that were used to benchmark spearmint, TPE, and SMAC is a good idea, and I will include

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-26 Thread Matthias Feurer
Dear Christof, dear scikit-learn team, This is a great idea; I highly encourage integrating Bayesian optimization into scikit-learn, since automatically configuring scikit-learn is quite powerful. It was done by the three winning teams of the first automated machine learning

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-25 Thread Andreas Mueller
Testing on the global optimization problems directly will actually be a time saver, as they can be evaluated cheaply, without needing to fit an estimator on MNIST for each point. On 03/25/2015 03:15 PM, Gael Varoquaux wrote: I am very afraid of the time sink that this will be. Sent
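For concreteness, a standard synthetic test function of this kind is Branin: each evaluation costs microseconds, versus minutes for fitting an estimator on MNIST. A minimal sketch with the usual constants:

    import numpy as np

    def branin(x):
        # Global minimum ~0.397887, attained at three points, e.g. (pi, 2.275);
        # the usual domain is x1 in [-5, 10], x2 in [0, 15].
        x1, x2 = x
        a, b, c = 1.0, 5.1 / (4 * np.pi ** 2), 5.0 / np.pi
        r, s, t = 6.0, 10.0, 1.0 / (8 * np.pi)
        return (a * (x2 - b * x1 ** 2 + c * x1 - r) ** 2
                + s * (1 - t) * np.cos(x1) + s)

    print(branin([np.pi, 2.275]))  # ~0.397887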

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-25 Thread Andreas Mueller
I think you could bench on other problems, but maybe focus on the ones in scikit-learn. Deep learning people might be happy with using external tools for optimizing. I'd also recommend benchmarking just the global optimization part on global optimization datasets as they were used in Jasper's

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-25 Thread Gael Varoquaux
I am very afraid of the time sink that this will be. Sent from my phone. Please forgive brevity and misspelling. On Mar 25, 2015, at 19:47, Andreas Mueller t3k...@gmail.com wrote: I think you could bench on other problems, but maybe focus on the ones in scikit-learn. Deep learning

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-25 Thread Kyle Kastner
See figure 5 of this paper: http://www.cs.ubc.ca/~hutter/papers/ICML14-HyperparameterAssessment.pdf for an example. There is a better paper that exclusively tackles this but I cannot find it at the moment. I was referring to the optimizer preferring algorithms which are both fast and give good

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-25 Thread Christof Angermueller
Which SMAC paper are you referring to? What do you mean about optimizing runtime/training time? The optimizer should find good parameters within a short time. Do you mean comparing the best result in a predefined time frame? For this, the 'expected improvement per second' acquisition
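For reference, "expected improvement per second" (Snoek et al., 2012) divides standard EI by a model of the training duration, so the search favours points that are both promising and cheap. A sketch for minimization, assuming the GP posterior and a duration model are given (all argument names here are mine):

    import numpy as np
    from scipy.stats import norm

    def ei_per_second(mu, sigma, mu_time, y_best):
        # mu, sigma: GP posterior mean/std of the loss at candidate points.
        # mu_time:   predicted training time in seconds, e.g. the mean of
        #            a second GP fitted to observed run times.
        # y_best:    lowest loss observed so far.
        sigma = np.maximum(sigma, 1e-12)            # guard zero variance
        z = (y_best - mu) / sigma
        ei = (y_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)
        return ei / np.maximum(mu_time, 1e-12)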

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-25 Thread Christof Angermueller
I decided to only benchmark scikit-learn models. Doing this properly and summarizing the results in a user-friendly rst document will take some time and should be sufficient for a GSoC project. More sophisticated benchmarks could be carried out afterwards. I plan to benchmark the following

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-24 Thread Christof Angermueller
Don't you think that I could also benchmark models that are not implemented in sklearn? For instance, I could write a wrapper DeepNet(...) with fit() and predict(), which internally uses Theano to build an ANN? In this way, I could benchmark complex deep networks beyond what will be
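Such a wrapper only has to honour the estimator contract: hyperparameters stored unchanged in __init__ (so get_params()/set_params() work) plus fit() and predict(). A runnable sketch, with a plain numpy logistic regression standing in for the Theano network (binary case only, for brevity):

    import numpy as np
    from sklearn.base import BaseEstimator, ClassifierMixin

    class DeepNet(BaseEstimator, ClassifierMixin):
        # Hypothetical wrapper; any *SearchCV can tune it because the
        # constructor only stores hyperparameters.
        def __init__(self, learning_rate=0.1, n_epochs=100):
            self.learning_rate = learning_rate
            self.n_epochs = n_epochs

        def fit(self, X, y):
            X = np.asarray(X, dtype=float)
            self.classes_, y01 = np.unique(y, return_inverse=True)
            self.w_, self.b_ = np.zeros(X.shape[1]), 0.0
            for _ in range(self.n_epochs):   # plain gradient descent
                p = 1.0 / (1.0 + np.exp(-(X.dot(self.w_) + self.b_)))
                grad = p - y01
                self.w_ -= self.learning_rate * X.T.dot(grad) / len(y01)
                self.b_ -= self.learning_rate * grad.mean()
            return self

        def predict(self, X):
            s = np.asarray(X, dtype=float).dot(self.w_) + self.b_
            return self.classes_[(s > 0).astype(int)]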

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-24 Thread Christof Angermueller
On 2015-03-24 21:25, Andy wrote: One thing that might also be interesting is bootstrapping (in the compiler sense, not the statistics sense) the optimizer. In the latest Jasper Snoek paper (http://arxiv.org/abs/1502.05700), they used a hyper-parameter optimizer to optimize the parameters of a

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-24 Thread Gael Varoquaux
Don't you think that I could also benchmark models that are not implemented in sklearn? For instance, I could write a wrapper DeepNet(...) with fit() and predict(), which internally uses Theano to build an ANN? In this way, I could benchmark complex deep networks beyond what will be

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-24 Thread Olivier Grisel
Christof, don't forget to put your proposal on melange by Thursday (the earlier the better). Please put scikit-learn in the title to make it easy to find. -- Olivier

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-24 Thread Kyle Kastner
I would focus on the API of this functionality and how/what users will be allowed to specify. To me, this is a particularly tricky bit of the PR. As Vlad said, take a close look at GridSearchCV and RandomizedSearchCV and see how they interact with the codebase. Do you plan to find good defaults

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-24 Thread Vlad Niculae
Hi Christof, Gael, hi everyone, On 24 Mar 2015, at 18:09, Gael Varoquaux gael.varoqu...@normalesup.org wrote: Don't you think that I could also benchmark models that are not implemented in sklearn? […] I am personally less interested in that. We already have a lot in scikit-learn and

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-24 Thread Christof Angermueller
Thanks Andy! I replied to your comments: https://docs.google.com/document/d/1bAWdiu6hZ6-FhSOlhgH-7x3weTluxRfouw9op9bHBxs/edit?usp=sharing. In summary: * I will not mention parallelization as an extended feature, * I suggest concrete data sets for benchmarking, * I mention tasks for which I

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-24 Thread Michael Eickenberg
On Tue, Mar 24, 2015 at 9:38 PM, Christof Angermueller c.angermuel...@gmail.com wrote: Thanks Andy! I replied to your comments: https://docs.google.com/document/d/1bAWdiu6hZ6-FhSOlhgH-7x3weTluxRfouw9op9bHBxs/edit?usp=sharing. In summary: * I will not mention parallelization as an

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-24 Thread Andy
On 03/24/2015 04:38 PM, Christof Angermueller wrote: Thanks Andy! I replied to your comments: https://docs.google.com/document/d/1bAWdiu6hZ6-FhSOlhgH-7x3weTluxRfouw9op9bHBxs/edit?usp=sharing. In summary: * I will not mention parallelization as an extended feature, * I suggest concrete data

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-24 Thread Christof Angermueller
Thanks, Andy! I will revise my proposal and submit it to melange today! Christof On 2015-03-24 00:07, Andreas Mueller wrote: Hi Christof. I gave some comments on the google doc. Andy On 03/19/2015 05:12 PM, Christof Angermueller wrote: Hi All, you can find my proposal for the

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-24 Thread Kyle Kastner
It might be nice to talk about optimizing runtime and/or training time, as SMAC did in their paper. I don't see any reason we couldn't do this in sklearn, and it might be of value to users since we don't really do deep learning, as Andy said. On Tue, Mar 24, 2015 at 4:52 PM, Andy t3k...@gmail.com

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-24 Thread Kyle Kastner
That said, I would think random forests would get a lot of the benefits that deep learning tasks might get, since they also have a lot of hyperparameters. Boosting tasks would be interesting as well, since swapping the estimator used could make a huge difference, though that may be trickier to
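For concreteness, the kind of random-forest space this implies, written for the existing RandomizedSearchCV (the ranges are illustrative only); a Bayesian optimizer would consume a similar specification but pick points adaptively instead of i.i.d.:

    from scipy.stats import randint
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.grid_search import RandomizedSearchCV

    param_distributions = {
        'n_estimators':      randint(10, 500),
        'max_depth':         randint(2, 20),
        'max_features':      ['sqrt', 'log2', None],
        'min_samples_split': randint(2, 20),
    }
    search = RandomizedSearchCV(RandomForestClassifier(),
                                param_distributions, n_iter=30)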

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-24 Thread Kyle Kastner
This paper (http://arxiv.org/pdf/1306.3476v1.pdf) might also give you some ideas for things to try. Boosting an untrained deep model got a lot of benefit from bayesian optimization. Note that this model was built prior to the release of the dataset! Weird but very interesting. On Tue, Mar 24,

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-24 Thread Andy
One thing that might also be interesting is bootstrapping (in the compiler sense, not the statistics sense) the optimizer. In the latest Jasper Snoek paper (http://arxiv.org/abs/1502.05700), they used a hyper-parameter optimizer to optimize the parameters of a hyper-parameter optimizer on a set of

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-23 Thread Andreas Mueller
Hi Christof. Can you please also post it on melange? Reviews will be coming soon ;) Andy On 03/19/2015 05:12 PM, Christof Angermueller wrote: Hi All, you can find my proposal for the hyperparameter optimization topic here: * http://goo.gl/XHuav8 *

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-19 Thread Joel Nothman
This is off-topic, but I should note that there is a patch at https://github.com/scikit-learn/scikit-learn/pull/2784 awaiting review for a while now... On 20 March 2015 at 08:16, Charles Martin charlesmarti...@gmail.com wrote: I would like to propose extending the linearSVC package by

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-19 Thread Christof Angermueller
Hi All, you can find my proposal for the hyperparameter optimization topic here: * http://goo.gl/XHuav8 * https://docs.google.com/document/d/1bAWdiu6hZ6-FhSOlhgH-7x3weTluxRfouw9op9bHBxs/edit?usp=sharing Please give feedback! Cheers, Christof On 2015-03-10 15:27, Sturla Molden wrote: Andreas

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-19 Thread Gael Varoquaux
Does anybody know of further optimization approaches that were not mentioned below and that we could consider? Maybe parallel computing. A grid search is an embarrassingly parallel problem; Bayesian optimization is not. We only have the necessary framework to tackle embarrassingly parallel
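A sketch of the contrast: grid points are all known up front, while each Bayesian proposal conditions on every previous result through the surrogate model (propose_next is a hypothetical stand-in; joblib was bundled with scikit-learn as sklearn.externals.joblib at the time):

    from joblib import Parallel, delayed

    def grid_search_parallel(evaluate, grid, n_jobs=4):
        # Embarrassingly parallel: all points are independent.
        return Parallel(n_jobs=n_jobs)(delayed(evaluate)(p) for p in grid)

    def bayes_opt_sequential(evaluate, propose_next, n_iter=25):
        # Inherently sequential: the surrogate must see the (point, score)
        # history before it can propose the next point.
        history = []
        for _ in range(n_iter):
            p = propose_next(history)
            history.append((p, evaluate(p)))
        return history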

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-11 Thread Christof Angermueller
I will have a closer look at the different optimization approaches and start to work on an outline for this topic. Does anybody know of further optimization approaches that were not mentioned below and that we could consider? Is there anybody else interested in this topic? Christof On

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-09 Thread Andreas Mueller
Does emcee implement Bayesian optimization? What is the distribution you assume? GPs? I thought emcee was a sampler. I need to check in with Dan ;) On 03/09/2015 09:27 AM, Sturla Molden wrote: For Bayesian optimization with MCMC (which I believe spearmint also does) I have found that emcee is

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-09 Thread Sturla Molden
For Bayesian optimization with MCMC (which I believe spearmint also does) I have found that emcee is very nice: http://dan.iel.fm/emcee/current/ It is much faster than naïve MCMC methods and all we need to do is provide a callback that computes the log-likelihood given the parameter set (which
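For reference, the pattern Sturla describes, with emcee's v2-era ensemble sampler and a toy Gaussian log-likelihood standing in for the real model:

    import numpy as np
    import emcee

    def log_likelihood(theta):
        # Replace with the model's log-likelihood given parameters theta.
        return -0.5 * np.sum(theta ** 2)

    ndim, nwalkers = 3, 16                       # use at least 2*ndim walkers
    p0 = 0.1 * np.random.randn(nwalkers, ndim)   # initial walker positions
    sampler = emcee.EnsembleSampler(nwalkers, ndim, log_likelihood)
    sampler.run_mcmc(p0, 500)
    samples = sampler.chain[:, 100:, :].reshape(-1, ndim)  # discard burn-in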

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-09 Thread Andy
Hi Christof. I think implementing either the GP or SMAC approach would be good. I talked to Jasper Snoek on Friday; possibly the trickiest part for the GP is the optimization of the resulting function. Spearmint also marginalizes out the hyperparameters, which our upcoming GP implementation

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-09 Thread Andreas Mueller
Yeah, I don't think we want to include that in the scope of the GSoC. Using MLE parameters still works, just converges a bit slower. On 03/09/2015 11:28 AM, Jan Hendrik Metzen wrote: A combination of emcee with GPs (in this case the GPs from george) is described here:

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-09 Thread Jan Hendrik Metzen
A combination of emcee with GPs (in this case the GPs from george) is described here: http://dan.iel.fm/george/current/user/hyper/#sampling-marginalization. As PR #4270 for sklearn also exposes a method log_marginal_likelihood(theta) in GaussianProcessRegressor, it should be straightforward to
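A hedged sketch of that combination, assuming the GaussianProcessRegressor API proposed in PR #4270 (theta is the log-transformed kernel hyperparameter vector):

    import numpy as np
    import emcee
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    X = np.random.rand(20, 1)
    y = np.sin(6 * X).ravel()
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0)).fit(X, y)

    def log_prob(theta):
        if np.any(np.abs(theta) > 10):   # crude flat prior in log-space
            return -np.inf
        return gp.log_marginal_likelihood(theta)

    ndim, nwalkers = len(gp.kernel_.theta), 8
    p0 = gp.kernel_.theta + 1e-3 * np.random.randn(nwalkers, ndim)
    sampler = emcee.EnsembleSampler(nwalkers, ndim, log_prob)
    sampler.run_mcmc(p0, 300)   # posterior samples replace a single MLE theta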

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-09 Thread Andreas Mueller
We wanted a bot that tells us about violations on PRs. Not sure if landscape.io can provide that: https://github.com/scikit-learn/scikit-learn/issues/3888#issuecomment-76037183 ragv also looked into this, I think. Not necessarily a binary fail/pass but more like a report by a bot. On 03/09/2015

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-07 Thread Kyle Kastner
I think finding one method is indeed the goal. Even if it is not the best every time, a 90% solution for 10% of the complexity would be awesome. I think GPs with parameter space warping are *probably* the best solution but only a good implementation will show for sure. Spearmint and hyperopt
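On "parameter space warping": in Snoek et al.'s input warping, each input dimension is passed through a Beta CDF whose shape parameters are learned alongside the GP hyperparameters, letting the GP handle non-stationary objectives. A minimal sketch of the transform itself:

    from scipy.stats import beta

    def warp(x, a, b):
        # x is a unit-interval input; a = b = 1 is the identity warp.
        # Other (a, b) settings reallocate resolution to where the
        # objective varies fastest, one pair per input dimension.
        return beta.cdf(x, a, b)

    print(warp(0.5, 1.0, 1.0))  # 0.5 (identity)
    print(warp(0.5, 2.0, 5.0))  # ~0.89: resolution concentrated at small x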