I submitted my final proposal to melange.
Thanks everybody for your suggestions!
Christof
On 20150326 22:51, Andy wrote:
I think you should focus on first creating a prototype without
ParamSklearn.
On 03/26/2015 06:19 PM, Christof Angermueller wrote:
Hi Matthias,
using HPOlib to benchmark
I think you should focus on first creating a prototype without ParamSklearn.
On 03/26/2015 06:19 PM, Christof Angermueller wrote:
Hi Matthias,
using HPOlib to benchmark GPSearchCV on the same datasets that were
used to benchmark spearmint, TPE, and SMAC, is a good idea, and I will
include it
Hi Matthias,
using HPOlib to benchmark GPSearchCV on the same datasets that were used
to benchmark spearmint, TPE, and SMAC, is a good idea, and I will
include it in my proposal. However, I plan to primarily compare
GPSearchCV with GridSearchCV, RandomizedSearchCV, as well as spearmint
as on
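As a rough sketch of what that comparison could look like: the same estimator and data run through GridSearchCV, RandomizedSearchCV, and the proposed GPSearchCV. Only the first two are existing scikit-learn APIs (in sklearn.grid_search at the time of this thread, sklearn.model_selection in later releases); the GPSearchCV call is hypothetical.

from scipy.stats import expon
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.svm import SVC

digits = load_digits()
X, y = digits.data, digits.target

# exhaustive grid and random sampling over the same two SVC parameters
grid = GridSearchCV(SVC(), {'C': [0.1, 1, 10, 100], 'gamma': [1e-4, 1e-3, 1e-2]})
rand = RandomizedSearchCV(SVC(), {'C': expon(scale=10), 'gamma': expon(scale=0.01)},
                          n_iter=20, random_state=0)
# hypothetical: gp = GPSearchCV(SVC(), {'C': (1e-2, 1e2), 'gamma': (1e-4, 1e-1)}, n_iter=20)

for search in (grid, rand):
    search.fit(X, y)
    print(search.best_params_, search.best_score_)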
I think the class that you introduce should really be geared towards
scikit-learn estimators.
But there could be a "lower level" function that just optimizes a
black-box function.
That is probably desirable from a modularity standpoint and for testing
anyhow.
On 03/26/2015 05:07 PM, Christof
GridSearchCV and RandomizedSearchCV inherit from BaseSearchCV and require
an estimator object with fit() and predict() as first constructor
argument. Hence, the estimator must follow the sklearn convention with
fit() and predict(). Alternatively, the estimator might also be implemented as
a black-box
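A minimal sketch of that two-layer split, assuming hypothetical names (bayes_minimize for the lower-level black-box layer, GPSearchCV for the estimator-facing class); random search stands in for the GP loop so the sketch stays runnable and testable on plain functions:

import numpy as np
from sklearn.base import clone
from sklearn.model_selection import cross_val_score  # sklearn.cross_validation at the time of this thread

def bayes_minimize(objective, bounds, n_iter=25, random_state=None):
    # Lower-level layer: minimize a black-box objective over box bounds.
    rng = np.random.RandomState(random_state)
    bounds = np.asarray(bounds, dtype=float)
    best_x, best_y = None, np.inf
    for _ in range(n_iter):
        x = rng.uniform(bounds[:, 0], bounds[:, 1])  # placeholder for the GP proposal step
        y = objective(x)
        if y < best_y:
            best_x, best_y = x, y
    return best_x, best_y

class GPSearchCV(object):
    # Estimator-facing layer: turns cross-validated scoring into a black box.
    def __init__(self, estimator, param_bounds, n_iter=25, cv=3):
        self.estimator = estimator
        self.param_bounds = param_bounds  # e.g. {'C': (1e-3, 1e3), 'gamma': (1e-4, 1e-1)}
        self.n_iter = n_iter
        self.cv = cv

    def fit(self, X, y):
        names = sorted(self.param_bounds)
        bounds = [self.param_bounds[name] for name in names]

        def objective(x):
            params = dict(zip(names, x))
            est = clone(self.estimator).set_params(**params)
            # negate the CV score because the lower-level routine minimizes
            return -cross_val_score(est, X, y, cv=self.cv).mean()

        best_x, best_y = bayes_minimize(objective, bounds, self.n_iter)
        self.best_params_ = dict(zip(names, best_x))
        self.best_score_ = -best_y
        return self

The lower-level function can then be unit-tested on plain functions, independently of any estimator.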
Hi Andy and others,
I revised my proposal
(https://docs.google.com/document/d/1bAWdiu6hZ6-FhSOlhgH-7x3weTluxRfouw9op9bHBxs/edit?usp=sharing)
and submitted it to melange. Can you have a look and check whether any essential
(formal) things are missing?
I will submit the final version tomorrow.
Cheers,
Christof
Hi Matthias.
As far as I know, the main goal for TPE was to support tree-structured
parameter spaces. I am not sure we want to go there yet because of the
more complex API.
On non-tree structured spaces, I think TPE performed worse than SMAC and GP.
With regard to your code: There might be tou
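For illustration, this is roughly the difference: a flat box of bounds is enough for the GP approach, while TPE is built for spaces where whole branches of parameters only exist after an earlier choice. The nested-dict notation is ad hoc, not a proposed API.

# Flat space: every parameter is always active.
flat_space = {
    'C': (1e-3, 1e3),
    'gamma': (1e-4, 1e-1),
}

# Tree-structured space: which parameters exist depends on an earlier choice,
# e.g. 'gamma' only matters for kernel == 'rbf', 'degree' only for 'poly'.
tree_space = {
    'kernel': {
        'rbf':    {'C': (1e-3, 1e3), 'gamma': (1e-4, 1e-1)},
        'poly':   {'C': (1e-3, 1e3), 'degree': (2, 5)},
        'linear': {'C': (1e-3, 1e3)},
    },
}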
Dear Christof, dear scikit-learn team,
This is a great idea; I highly encourage the plan to integrate Bayesian
optimization into scikit-learn, since automatically configuring
scikit-learn is quite powerful. It was done by the three winning teams
of the first automated machine learning competition
As I said, at least for development purposes I think it might
help you to also compare on the global optimization problems that
Jasper reports on in the deep neural net paper. That is probably
not for the docs, though.
I think the list below is good. Having some pipelines might also b
I decided to only benchmark scikit-learn models. Doing this properly and
summarizing the results in a user-friendly rst document will take some
time and should be sufficient for a GSoC project. More sophisticated
benchmarks could be carried out afterwards.
I plan to benchmark the following models
See figure 5 of this paper:
http://www.cs.ubc.ca/~hutter/papers/ICML14-HyperparameterAssessment.pdf
for an example.
There is a better paper that exclusively tackles this but I cannot
find it at the moment.
I was referring to the optimizer preferring algorithms which are both
fast and give good performance
Which SMAC paper are you referring to?
What do you mean about optimizing runtime/training time? The optimizer
should find good parameters within a short time. Do you mean comparing
the best result within a predefined time frame? For this, the 'expected
improvement per second' acquisition function
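For reference, a minimal sketch of the 'expected improvement per second' idea (Snoek et al., 2012): plain EI divided by the predicted evaluation time, so points that are both promising and cheap are preferred. All names below are illustrative.

import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, y_best):
    # Standard EI for minimization, given the GP posterior mean/std at a point.
    sigma = np.maximum(sigma, 1e-12)
    z = (y_best - mu) / sigma
    return (y_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def ei_per_second(mu, sigma, y_best, mu_log_time):
    # mu_log_time would come from a second GP fitted to log evaluation durations.
    return expected_improvement(mu, sigma, y_best) / np.exp(mu_log_time)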
Testing on the global optimization problems directly will actually be a
time saver, as they can be evaluated cheaply, without needing to fit an
estimator on MNIST for each point (see the Branin sketch after this message).
On 03/25/2015 03:15 PM, Gael Varoquaux wrote:
I am very afraid of the time sink that this will be.
Sent fro
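For concreteness, the Branin function is one of the standard synthetic benchmarks in those papers; it is evaluated directly, so no estimator has to be trained per point. The bayes_minimize call at the bottom refers to the hypothetical lower-level optimizer sketched earlier.

import numpy as np

def branin(x):
    # Global minimum ~0.397887, attained at three points in [-5, 10] x [0, 15].
    x1, x2 = x
    a, b, c = 1.0, 5.1 / (4 * np.pi ** 2), 5.0 / np.pi
    r, s, t = 6.0, 10.0, 1.0 / (8 * np.pi)
    return a * (x2 - b * x1 ** 2 + c * x1 - r) ** 2 + s * (1 - t) * np.cos(x1) + s

# e.g. best_x, best_y = bayes_minimize(branin, bounds=[(-5, 10), (0, 15)], n_iter=100)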
I am very afraid of the time sink that this will be.
Sent from my phone. Please forgive brevity and misspelling.
On Mar 25, 2015, at 19:47, Andreas Mueller wrote:
>I think you could bench on other problems, but maybe focus on the ones
>in scikit-learn.
>Deep learning people might be h
I think you could bench on other problems, but maybe focus on the ones
in scikit-learn.
Deep learning people might be happy with using external tools for
optimizing.
I'd also recommend benchmarking just the global optimization part on
global optimization datasets as used in Jasper's work
I would focus on the API of this functionality and how/what users will
be allowed to specify. To me, this is a particularly tricky bit of the
PR. As Vlad said, take a close look at GridSearchCV and
RandomizedSearchCV and see how they interact with the codebase. Do you
plan to find good defaults for
Hi Christof, Gael, hi everyone,
> On 24 Mar 2015, at 18:09, Gael Varoquaux
> wrote:
>
>> Don't you think that I could also benchmark models that are not
>> implemented in sklearn? […]
>
> I am personally less interested in that. We have already a lot in
> scikit-learn and more than enough to
Christof, don't forget to put your proposal on melange by Thursday
(the earlier the better). Please put "scikit-learn" in the title to
make it easy to find.
--
Olivier
> Don't you think that I could also benchmark models that are not
> implemented in sklearn? For instance, I could write a wrapper
> DeepNet(...) with fit() and predict(), which internally uses theano
> to build an ANN? In this way, I could benchmark complex deep networks
> beyond what will b
On 20150324 21:25, Andy wrote:
> One thing that might also be interesting is "Bootstrapping" (in the
> compiler sense, not the statistics sense) the optimizer.
> In the latest Jasper Snoek paper (http://arxiv.org/abs/1502.05700) they used
> a hyper-parameter optimizer to optimize the parameters
> of a
Don't you think that I could also benchmark models that are not
implemented in sklearn? For instance, I could write a wrapper
DeepNet(...) with fit() and predict(), which internally uses theano
to build an ANN? In this way, I could benchmark complex deep networks
beyond what will be possible
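A skeleton of such a wrapper might look as follows; DeepNet and its parameters are hypothetical and the theano internals are left as stubs. Anything that exposes fit()/predict() (plus get_params/set_params via BaseEstimator) can be dropped into the search classes.

from sklearn.base import BaseEstimator, ClassifierMixin

class DeepNet(BaseEstimator, ClassifierMixin):
    def __init__(self, n_hidden=100, learning_rate=0.01, n_epochs=10):
        self.n_hidden = n_hidden
        self.learning_rate = learning_rate
        self.n_epochs = n_epochs

    def fit(self, X, y):
        # build and train the theano graph here
        raise NotImplementedError("theano model construction goes here")

    def predict(self, X):
        # forward pass through the trained network
        raise NotImplementedError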
One thing that might also be interesting is "Bootstrapping" (in the
compiler sense, not the statistics sense) the optimizer.
In the latest Jasper Snoek paper (http://arxiv.org/abs/1502.05700) they used
a hyper-parameter optimizer to optimize the parameters
of a hyper-parameter optimizer on a set of opt
This paper (http://arxiv.org/pdf/1306.3476v1.pdf) might also give you
some ideas for things to try. Boosting an untrained "deep" model got a
lot of benefit from bayesian optimization. Note that this model was
built prior to the release of the dataset! Weird but very interesting.
On Tue, Mar 24, 20
That said, I would think random forests would get a lot of the
benefits that deep learning tasks might get, since they also have a
lot of hyperparameters. Boosting tasks would be interesting as well,
since swapping the estimator used could make a huge difference, though
that may be trickier to implement
It might be nice to talk about optimizing runtime and/or training time
like SMAC did in their paper. I don't see any reason we couldn't do
this in sklearn, and it might be of value to users since we don't
really do deep learning as Andy said.
On Tue, Mar 24, 2015 at 4:52 PM, Andy wrote:
> On 03/2
On 03/24/2015 04:38 PM, Christof Angermueller wrote:
> Thanks Andy! I replied to your comments:
> https://docs.google.com/document/d/1bAWdiu6hZ6-FhSOlhgH-7x3weTluxRfouw9op9bHBxs/edit?usp=sharing.
>
> In summary,
> * I will not mention parallelization as an extended feature,
> * suggest concrete d
On Tue, Mar 24, 2015 at 9:38 PM, Christof Angermueller <
c.angermuel...@gmail.com> wrote:
> Thanks Andy! I replied to your comments:
>
> https://docs.google.com/document/d/1bAWdiu6hZ6-FhSOlhgH-7x3weTluxRfouw9op9bHBxs/edit?usp=sharing
> .
>
> In summary,
> * I will not mention parallelization as a
Thanks Andy! I replied to your comments:
https://docs.google.com/document/d/1bAWdiu6hZ6-FhSOlhgH-7x3weTluxRfouw9op9bHBxs/edit?usp=sharing.
In summary,
* I will not mention parallelization as an extended feature,
* suggest concrete data sets for benchmarking,
* mention tasks for which I expect
Thanks Andy! I will revise my proposal and submit it to melange today!
Christof
On 20150324 00:07, Andreas Mueller wrote:
> Hi Christof.
> I gave some comments on the google doc.
>
> Andy
>
> On 03/19/2015 05:12 PM, Christof Angermueller wrote:
>> Hi All,
>>
>> you can find my proposal for the hy
Hi Christof.
I gave some comments on the google doc.
Andy
On 03/19/2015 05:12 PM, Christof Angermueller wrote:
> Hi All,
>
> you can find my proposal for the hyperparameter optimization topic here:
> * http://goo.gl/XHuav8
> *
> https://docs.google.com/document/d/1bAWdiu6hZ6-FhSOlhgH-7x3weTluxRfo
Hi Christof.
Can you please also post it on melange?
Reviews will be coming soon ;)
Andy
On 03/19/2015 05:12 PM, Christof Angermueller wrote:
> Hi All,
>
> you can find my proposal for the hyperparameter optimization topic here:
> * http://goo.gl/XHuav8
> *
> https://docs.google.com/document/d/1bA
This is off-topic, but I should note that there is a patch at
https://github.com/scikit-learn/scikit-learn/pull/2784 that has been
awaiting review for a while now...
On 20 March 2015 at 08:16, Charles Martin wrote:
> I would like to propose extending the linearSVC package
> by replacing the liblinear version
> Does anybody know of further optimization approaches that were not
> mentioned below and that we could consider?
Maybe parallel computing. A grid search is an embarrassingly parallel
problem; Bayesian optimization is not. We have the necessary framework
only to tackle embarrassingly parallel
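To make the contrast concrete: grid search evaluates independent candidates, so n_jobs fans the fits out trivially, whereas a Bayesian optimization loop must alternate propose -> evaluate -> update because each proposal depends on all previous results. The propose_next/update names are illustrative only.

from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV  # sklearn.grid_search at the time of this thread

param_grid = {'C': [0.1, 1, 10, 100], 'gamma': [1e-4, 1e-3, 1e-2]}
search = GridSearchCV(SVC(), param_grid, n_jobs=-1)  # all 12 candidate fits can run at once

# In contrast, the Bayesian loop is inherently sequential:
# for i in range(n_iter):
#     x = propose_next(model)      # uses every earlier (x, y) pair
#     y = objective(x)
#     model = update(model, x, y)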
Hi All,
you can find my proposal for the hyperparameter optimization topic here:
* http://goo.gl/XHuav8
*
https://docs.google.com/document/d/1bAWdiu6hZ6-FhSOlhgH-7x3weTluxRfouw9op9bHBxs/edit?usp=sharing
Please give feedback!
Cheers,
Christof
On 20150310 15:27, Sturla Molden wrote:
> Andreas M
I would like to propose extending the LinearSVC package
by replacing the liblinear version with a newer version that
1. allows setting instance weights
2. provides the dual variables / Lagrange multipliers
This would facilitate research and development of transductive SVMs
and related semi-supervised
I will have a closer look at the different optimization approaches and
start to work on an outline for this topic.
Does anybody know of further optimization approaches that were not
mentioned below and that we could consider?
Is there anybody else interested in this topic?
Christof
On 201503
Andreas Mueller wrote:
> Does emcee implement Bayesian optimization?
> What is the distribution you assume? GPs?
> I thought emcee was a sampler. I need to check in with Dan ;)
Just pick the mode :-)
The distribution is whatever you want it to be.
Sturla
>
>
> On 03/09/2015 09:27 AM, Stur
On 03/09/2015 07:11 PM, Saket Choudhary wrote:
> On 9 March 2015 at 15:42, Andreas Mueller wrote:
>> We wanted a bot that tells us about violations on PRs.
>> Not sure if landscape.io can provide that:
>> https://github.com/scikit-learn/scikit-learn/issues/3888#issuecomment-76037183
>>
>> ragv al
On 9 March 2015 at 15:42, Andreas Mueller wrote:
> We wanted a bot that tells us about violations on PRs.
> Not sure if landscape.io can provide that:
> https://github.com/scikit-learn/scikit-learn/issues/3888#issuecomment-76037183
>
> ragv also looked into this, I think.
> Not necessarily a binary
We wanted a bot that tells us about violations on PRs.
Not sure if landscape.io can provide that:
https://github.com/scikit-learn/scikit-learn/issues/3888#issuecomment-76037183
ragv also looked into this, I think.
Not necessarily a binary "fail/pass" but more like a report by a bot.
On 03/09/201
@andreas
I don't know if this is already a thing - but how about a (soft?)
requirement that new pull requests are PEP8 compliant (within reason)?
On Mon, Mar 9, 2015 at 6:26 PM, Andreas Mueller wrote:
>
> On 03/09/2015 05:24 PM, Christof Angermueller wrote:
> > Is there currently an open issue, such
On 03/09/2015 05:24 PM, Christof Angermueller wrote:
> Is there currently an open issue, such that I can submit some patches?
> #4354?
ragv might be working on this, not sure. I promised work but didn't
deliver ;)
You mean like general open issues? We have 341 of those:
https://github.com/scikit-
I agree with Kyle: an efficient, easy to use hyperparameter optimization
module that is consistent with the sklearn framework would be an
advantage over existing packages. In terms of efficiency, I would start
with ML estimation or variational inference instead of (slower) sampling.
I will read
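For reference, the ML-estimation route is what the GaussianProcessRegressor from PR #4270 already does when fitting: the kernel hyperparameters are obtained by maximizing the log marginal likelihood (type-II ML) with restarted gradient-based optimization, so no sampling is needed. A minimal sketch:

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(40, 1))
y = np.sin(X).ravel() + 0.1 * rng.randn(40)

kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=5).fit(X, y)
print(gp.kernel_)                                    # ML-II estimates of the hyperparameters
print(gp.log_marginal_likelihood(gp.kernel_.theta))  # value at the optimum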
Yeah, I don't think we want to include that in the scope of the GSoC.
Using MLE parameters still works, just converges a bit slower.
On 03/09/2015 11:28 AM, Jan Hendrik Metzen wrote:
> A combination of emcee with GPs (in this case the GPs from george) is
> described here:
> http://dan.iel.fm/georg
A combination of emcee with GPs (in this case the GPs from george) is
described here:
http://dan.iel.fm/george/current/user/hyper/#sampling-marginalization
As PR #4270 for sklearn also exposes a method
log_marginal_likelihood(theta) in GaussianProcessRegressor, it should be
straightforward to adapt.
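A rough sketch of how that could look, using log_marginal_likelihood(theta) as the emcee log-probability callback (theta lives in log space); the flat prior and the walker initialization are deliberately simplistic.

import numpy as np
import emcee
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(30, 1))
y = np.sin(X).ravel() + 0.1 * rng.randn(30)
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0)).fit(X, y)

def log_prob(theta):
    # flat prior on a box in log space, GP marginal likelihood inside it
    if np.any(theta < -5) or np.any(theta > 5):
        return -np.inf
    return gp.log_marginal_likelihood(theta)

ndim = len(gp.kernel_.theta)
nwalkers = 2 * ndim + 2
p0 = gp.kernel_.theta + 1e-3 * rng.randn(nwalkers, ndim)
sampler = emcee.EnsembleSampler(nwalkers, ndim, log_prob)
sampler.run_mcmc(p0, 500)
samples = sampler.get_chain(flat=True)  # posterior samples in log space; older emcee: sampler.flatchain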
Does emcee implement Bayesian optimization?
What is the distribution you assume? GPs?
I thought emcee was a sampler. I need to check in with Dan ;)
On 03/09/2015 09:27 AM, Sturla Molden wrote:
> For Bayesian optimization with MCMC (which I believe spearmint also
> does) I have found that emcee is
For Bayesian optimization with MCMC (which I believe spearmint also
does) I have found that emcee is very nice:
http://dan.iel.fm/emcee/current/
It is much faster than naïve MCMC methods, and all we need to do is
provide a callback that computes the log-likelihood given the parameter
set (which
Hi Christof.
I think implementing either the GP or SMAC approach would be good.
I talked to Jasper Snoek on Friday; possibly the trickiest part for
the GP is the optimization of the resulting function.
Spearmint also marginalizes out the hyperparameters, which our upcoming
GP implementation d
I think finding one method is indeed the goal. Even if it is not the best
every time, a 90% solution for 10% of the complexity would be awesome. I
think GPs with parameter space warping are *probably* the best solution but
only a good implementation will show for sure.
Spearmint and hyperopt exist