I submitted my final proposal to melange.
Thanks everybody for your suggestions!
Christof
On 2015-03-26 22:51, Andy wrote:
I think you should focus on first creating a prototype without
ParamSklearn.
On 03/26/2015 06:19 PM, Christof Angermueller wrote:
Hi Matthias,
using HPOlib to benchmark
Hi Matthias.
As far as I know, the main goal for TPE was to support tree-structured
parameter spaces. I am not sure we want to go there yet because of the
more complex API.
On non-tree structured spaces, I think TPE performed worse than SMAC and GP.
With regard to your code: There might be
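The tree-structured spaces mentioned above can be made concrete with a small sketch. The nested-dict notation and the `active_params` helper below are hypothetical, just to illustrate the idea: depending on the value chosen for one parameter (here "kernel"), different child parameters become active, which is what makes the API more complex than a flat space.

```python
# Sketch of a tree-structured (conditional) parameter space, using a
# hypothetical nested-dict notation: which hyperparameters are active
# depends on the value chosen for "kernel".
search_space = {
    "C": ("log-uniform", 1e-3, 1e3),
    "kernel": {
        "rbf": {"gamma": ("log-uniform", 1e-4, 1e1)},
        "poly": {"degree": ("int-uniform", 2, 5),
                 "coef0": ("uniform", 0.0, 1.0)},
        "linear": {},  # no extra hyperparameters
    },
}

def active_params(space, kernel_choice):
    """Return the flat parameter names active for a given kernel choice."""
    flat = [k for k, v in space.items() if not isinstance(v, dict)]
    flat += list(space["kernel"][kernel_choice])
    return sorted(flat)

print(active_params(search_space, "rbf"))     # ['C', 'gamma']
print(active_params(search_space, "linear"))  # ['C']
```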
Hi Andy and others,
I revised my proposal
(https://docs.google.com/document/d/1bAWdiu6hZ6-FhSOlhgH-7x3weTluxRfouw9op9bHBxs/edit?usp=sharing)
and submitted it to melange. Could you have a look and check whether any
essential (formal) things are missing?
I will submit the final version tomorrow.
Cheers,
I think the class that you introduce should really be geared towards
scikit-learn estimators.
But there could be a lower level function that just optimizes a
black-box function.
That is probably desirable from a modularity standpoint and for testing
anyhow.
On 03/26/2015 05:07 PM, Christof
GridSearchCV and RandomizedSearchCV inherit from BaseCV and require
an estimator object with fit() and predict() as the first constructor
argument. Hence, the estimator must follow the sklearn convention with
fit() and predict(). Alternatively, the estimator might also be implemented as
a black-box
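The two-layer design discussed above (an estimator-oriented class on top of a lower-level routine that just optimizes a black-box function) could be sketched roughly as follows. The names `minimize` and `make_objective` are hypothetical, and plain random search stands in for the actual GP loop; the point is only the separation of concerns, which also makes the lower layer easy to test.

```python
import random

def minimize(f, bounds, n_iter=50, seed=0):
    """Lower-level black-box optimizer (random search stands in here
    for the GP loop): f maps a list of floats to a value to minimize."""
    rng = random.Random(seed)
    best_x, best_y = None, float("inf")
    for _ in range(n_iter):
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        y = f(x)
        if y < best_y:
            best_x, best_y = x, y
    return best_x, best_y

def make_objective(estimator_factory, score):
    """The estimator-oriented class would be a thin wrapper that builds
    the black-box function from (estimator constructor, CV score)."""
    def f(x):
        # maximize score == minimize negated score
        return -score(estimator_factory(x))
    return f

# toy usage on a plain quadratic instead of a real estimator
x, y = minimize(lambda p: (p[0] - 1.0) ** 2, [(-5.0, 5.0)])
```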
Hi Matthias,
using HPOlib to benchmark GPSearchCV on the same datasets that were used
to benchmark spearmint, TPE, and SMAC, is a good idea, and I will
include it in my proposal. However, I plan to primarily compare
GPSearchCV with GridSearchCV, RandomizedSearchCV, as well as spearmint
as
Dear Christof, dear scikit-learn team,
This is a great idea; I highly encourage integrating Bayesian
optimization into scikit-learn, since automatically configuring
scikit-learn is quite powerful. It was done by the three winning teams
of the first automated machine learning
Testing on the global optimization problems directly will actually be a
time saver, as they can be evaluated instantly, without needing to fit an
estimator on MNIST for each point.
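For illustration, Branin is one such standard synthetic global optimization benchmark (it appears in the spearmint experiments). Evaluating it is a single arithmetic expression, which is what makes it so much cheaper than fitting an estimator at every proposed point:

```python
import math

def branin(x1, x2):
    """Branin test function, a standard synthetic benchmark for global
    optimizers; its global minimum value is about 0.397887."""
    a, b, c = 1.0, 5.1 / (4 * math.pi ** 2), 5.0 / math.pi
    r, s, t = 6.0, 10.0, 1.0 / (8 * math.pi)
    return (a * (x2 - b * x1 ** 2 + c * x1 - r) ** 2
            + s * (1 - t) * math.cos(x1) + s)

# One known global minimizer is (pi, 2.275); the evaluation is
# instantaneous, unlike training a model on MNIST at each point.
val = branin(math.pi, 2.275)
```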
I think you could bench on other problems, but maybe focus on the ones
in scikit-learn.
Deep learning people might be happy with using external tools for
optimizing.
I'd also recommend benchmarking just the global optimization part on
global optimization datasets as they were used in Jasper's
I am very afraid of the time sink that this will be.
Sent from my phone. Please forgive brevity and misspelling
See figure 5 of this paper:
http://www.cs.ubc.ca/~hutter/papers/ICML14-HyperparameterAssessment.pdf
for an example.
There is a better paper that exclusively tackles this but I cannot
find it at the moment.
I was referring to the optimizer preferring algorithms which are both
fast and give good
Which SMAC paper are you referring to?
What do you mean by optimizing runtime/training time? The optimizer
should find good parameters within a short time. Do you mean comparing
the best result within a predefined time frame? For this, the 'expected
improvement per second' acquisition
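For reference, a minimal sketch of what 'expected improvement per second' means (as described in Snoek et al.'s "Practical Bayesian Optimization of Machine Learning Algorithms"): the usual EI acquisition value is divided by a model of the evaluation time, so that cheap-but-promising points are preferred over expensive ones. The numbers below are made up purely for illustration:

```python
import math

def expected_improvement(mu, sigma, best):
    """Standard EI for minimization, given a GP posterior (mu, sigma)
    at a candidate point and the best observed value so far."""
    if sigma <= 0:
        return 0.0
    z = (best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return (best - mu) * cdf + sigma * pdf

def ei_per_second(mu, sigma, best, predicted_seconds):
    """EI per second: scale EI by the (modelled) cost of evaluating
    the point, so fast evaluations are preferred."""
    return expected_improvement(mu, sigma, best) / predicted_seconds

# A slightly worse point that is 30x cheaper to evaluate can win:
fast = ei_per_second(mu=0.55, sigma=0.1, best=0.5, predicted_seconds=2.0)
slow = ei_per_second(mu=0.50, sigma=0.1, best=0.5, predicted_seconds=60.0)
```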
I decided to only benchmark scikit-learn models. Doing this properly and
summarizing the results in a user-friendly rst document will take some
time and should be sufficient for a GSoC project. More sophisticated
benchmarks could be carried out afterwards.
I plan to benchmark the following
Don't you think that I could also benchmark models that are not
implemented in sklearn? For instance, I could write a wrapper
DeepNet(...) with fit() and predict() that internally uses Theano
to build an ANN? In this way, I could benchmark complex deep networks
beyond what will be
Christof, don't forget to put your proposal on melange by Thursday
(the earlier the better). Please put scikit-learn in the title to
make it easy to find.
--
Olivier
I would focus on the API of this functionality and how/what users will
be allowed to specify. To me, this is a particularly tricky bit of the
PR. As Vlad said, take a close look at GridSearchCV and
RandomizedSearchCV and see how they interact with the codebase. Do you
plan to find good defaults
Hi Christof, Gael, hi everyone,
On 24 Mar 2015, at 18:09, Gael Varoquaux gael.varoqu...@normalesup.org
wrote:
Don't you think that I could also benchmark models that are not
implemented in sklearn? […]
I am personally less interested in that. We already have a lot in
scikit-learn and
Thanks Andy! I replied to your comments:
https://docs.google.com/document/d/1bAWdiu6hZ6-FhSOlhgH-7x3weTluxRfouw9op9bHBxs/edit?usp=sharing.
In summary,
* I will not mention parallelization as an extended feature,
* I suggest concrete datasets for benchmarking,
* I mention tasks for which I
Thanks Andy! I will revise my proposal and submit it to melange today!
Christof
On 2015-03-24 00:07, Andreas Mueller wrote:
Hi Christof.
I gave some comments on the google doc.
Andy
It might be nice to talk about optimizing runtime and/or training time
like SMAC did in their paper. I don't see any reason we couldn't do
this in sklearn, and it might be of value to users since we don't
really do deep learning as Andy said.
On Tue, Mar 24, 2015 at 4:52 PM, Andy t3k...@gmail.com
That said, I would think random forests would get a lot of the
benefits that deep learning tasks might get, since they also have a
lot of hyperparameters. Boosting tasks would be interesting as well,
since swapping the estimator used could make a huge difference, though
that may be trickier to
This paper (http://arxiv.org/pdf/1306.3476v1.pdf) might also give you
some ideas for things to try. Boosting an untrained deep model got a
lot of benefit from bayesian optimization. Note that this model was
built prior to the release of the dataset! Weird but very interesting.
On Tue, Mar 24,
One thing that might also be interesting is bootstrapping (in the
compiler sense, not the statistics sense) the optimizer.
In the latest Jasper Snoek paper (http://arxiv.org/abs/1502.05700), they
used a hyper-parameter optimizer to optimize the parameters
of a hyper-parameter optimizer on a set of
Hi Christof.
Can you please also post it on melange?
Reviews will be coming soon ;)
Andy
This is off-topic, but I should note that there is a patch at
https://github.com/scikit-learn/scikit-learn/pull/2784 awaiting review for
a while now...
On 20 March 2015 at 08:16, Charles Martin charlesmarti...@gmail.com wrote:
I would like to propose extending the LinearSVC package
by
Hi All,
you can find my proposal for the hyperparameter optimization topic here:
* http://goo.gl/XHuav8
* https://docs.google.com/document/d/1bAWdiu6hZ6-FhSOlhgH-7x3weTluxRfouw9op9bHBxs/edit?usp=sharing
Please give feedback!
Cheers,
Christof
On 2015-03-10 15:27, Sturla Molden wrote:
Andreas
Does anybody know of further optimization approaches that were not
mentioned below and that we could consider?
Maybe parallel computing. A grid search is an embarrassingly parallel
problem; a Bayesian optimization is not. We only have the necessary
framework to tackle embarrassingly parallel
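The contrast can be sketched briefly: in a grid search all candidates are known upfront and can be dispatched at once, whereas in Bayesian optimization each proposal depends on the results so far. The `propose` function below is a stand-in (random sampling instead of a fitted surrogate model), and `evaluate` stands in for a cross-validation score:

```python
import random
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def evaluate(params):
    """Stand-in for a cross-validation score at one hyperparameter point."""
    c, gamma = params
    return (c - 1.0) ** 2 + (gamma - 0.1) ** 2

# Grid search: all candidates are known upfront, so every evaluation
# can be dispatched at once -- embarrassingly parallel.
grid = list(product([0.1, 1.0, 10.0], [0.01, 0.1, 1.0]))
with ThreadPoolExecutor() as pool:
    grid_scores = list(pool.map(evaluate, grid))

# Bayesian optimization: each new proposal depends on all previous
# results, so the outer loop is inherently sequential. propose() here
# just samples randomly instead of fitting a surrogate model.
rng = random.Random(0)

def propose(history):
    return (rng.uniform(0.1, 10.0), rng.uniform(0.01, 1.0))

history = []
for _ in range(9):
    params = propose(history)
    history.append((params, evaluate(params)))
```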
I will have a closer look at the different optimization approaches and
start to work on an outline for this topic.
Does anybody know of further optimization approaches that were not
mentioned below and that we could consider?
Is there anybody else interested in this topic?
Christof
On
Does emcee implement Bayesian optimization?
What is the distribution you assume? GPs?
I thought emcee was a sampler. I need to check in with Dan ;)
For Bayesian optimization with MCMC (which I believe spearmint also
does) I have found that emcee is very nice:
http://dan.iel.fm/emcee/current/
It is much faster than naïve MCMC methods and all we need to do is
compute a callback that computes the loglikelihood given the parameter
set (which
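A minimal sketch of such a callback: emcee only ever needs a single callable returning the log-probability. The toy Gaussian below is made up purely as a stand-in for the true GP log marginal likelihood, and the commented lines show roughly how emcee's EnsembleSampler would consume it:

```python
import math

def log_prob(theta):
    """Log-posterior callback for the GP hyperparameters theta; this one
    function is all emcee needs. A toy Gaussian stands in here for the
    true GP log marginal likelihood."""
    length_scale, noise = theta
    if length_scale <= 0 or noise <= 0:
        return -math.inf  # log-prior: reject invalid hyperparameters
    # stand-in for log p(y | X, theta):
    return -0.5 * (math.log(length_scale) ** 2
                   + (math.log(noise) + 2.0) ** 2)

# With emcee (not imported here) the sampler would then be, roughly:
#   sampler = emcee.EnsembleSampler(nwalkers, ndim, log_prob)
#   sampler.run_mcmc(initial_positions, n_steps)
val = log_prob([1.0, math.exp(-2.0)])
```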
Hi Christof.
I think implementing either the GP or SMAC approach would be good.
I talked to Jasper Snoek on Friday; possibly the trickiest part for
the GP is the optimization of the resulting function.
Spearmint also marginalizes out the hyperparameters, which our upcoming
GP implementation
Yeah, I don't think we want to include that in the scope of the GSoC.
Using MLE parameters still works, just converges a bit slower.
A combination of emcee with GPs (in this case the GPs from george) is
described here:
http://dan.iel.fm/george/current/user/hyper/#sampling-marginalization
As PR #4270 for sklearn also exposes a method
log_marginal_likelihood(theta) in GaussianProcessRegressor, it should be
straight-forward to
We wanted a bot that tells us about violations on PRs.
Not sure if landscape.io can provide that:
https://github.com/scikit-learn/scikit-learn/issues/3888#issuecomment-76037183
ragv also looked into this, I think.
Not necessarily a binary fail/pass, but more like a report from a bot.
On 03/09/2015
I think finding one method is indeed the goal. Even if it is not the best
every time, a 90% solution for 10% of the complexity would be awesome. I
think GPs with parameter space warping are *probably* the best solution but
only a good implementation will show for sure.
Spearmint and hyperopt