For Bayesian optimization with MCMC (which I believe Spearmint also does) I have found that emcee is very nice: http://dan.iel.fm/emcee/current/ It is much faster than naïve MCMC methods, and all we need to provide is a callback that computes the log-likelihood given the parameter set (which can just as well be a set of hyperparameters).
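A minimal sketch of that pattern with emcee's EnsembleSampler; the quadratic log_prob below is a toy stand-in for a real objective such as a GP marginal likelihood over hyperparameters:

    import numpy as np
    import emcee

    ndim = 2       # number of (hyper)parameters
    nwalkers = 32  # emcee needs at least 2 * ndim walkers

    def log_prob(theta):
        # Toy placeholder: a real objective would evaluate the model at
        # theta, e.g. a GP marginal likelihood or a cross-validation score.
        return -0.5 * np.sum(theta ** 2)

    # Start the walkers in a small ball around an initial guess.
    p0 = 1e-3 * np.random.randn(nwalkers, ndim)

    sampler = emcee.EnsembleSampler(nwalkers, ndim, log_prob)
    sampler.run_mcmc(p0, 1000)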
To do this computation in parallel, one can simply evaluate the walkers in parallel and do a barrier synchronization after each step. The contention due to the barrier can be reduced by increasing the number of walkers as needed. One should also use something like DCMT for random numbers, to make sure there is no contention for the PRNG and that each thread (or process) gets an independent stream of random numbers.

emcee implements this kind of parallelism using multiprocessing, but it passes parameter sets around using pickle, and is therefore not very efficient compared to just keeping the current parameters of each walker in shared memory. So there is a lot of room for improvement here. Two quick sketches of these points follow below.
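First, a minimal sketch of the parallel scheme using emcee's pool hook (the pool object only needs a map method): at each step, pool.map evaluates the walkers' log-likelihoods concurrently, and the map call itself acts as the barrier before the next step. The pickling overhead I mentioned is exactly this map serializing each parameter vector:

    from multiprocessing import Pool

    import numpy as np
    import emcee

    ndim, nwalkers = 2, 32

    def log_prob(theta):
        # Stand-in for the real log-likelihood; it must be a module-level
        # function so that multiprocessing can pickle it.
        return -0.5 * np.sum(theta ** 2)

    if __name__ == "__main__":
        with Pool() as pool:
            sampler = emcee.EnsembleSampler(nwalkers, ndim, log_prob,
                                            pool=pool)
            p0 = 1e-3 * np.random.randn(nwalkers, ndim)
            sampler.run_mcmc(p0, 1000)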
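Second, the PRNG point. DCMT itself is a C library (Dynamic Creation of Mersenne Twisters); here is a rough sketch of the same idea in NumPy, one statistically independent stream per worker with no locking on a shared generator, using SeedSequence (a newer facility than DCMT, shown only to illustrate the pattern):

    import numpy as np

    n_workers = 8
    root = np.random.SeedSequence(12345)

    # spawn() derives child seed sequences whose streams do not overlap,
    # so each worker can draw random numbers without any contention.
    rngs = [np.random.default_rng(s) for s in root.spawn(n_workers)]

    print(rngs[0].standard_normal(3))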
Sturla

On 07/03/15 15:06, Kyle Kastner wrote:
> I think finding one method is indeed the goal. Even if it is not the
> best every time, a 90% solution for 10% of the complexity would be
> awesome. I think GPs with parameter space warping are *probably* the
> best solution, but only a good implementation will show for sure.
>
> Spearmint and hyperopt exist and work for more complex stuff, but with
> far more moving parts and complexity. Having a tool that is as easy to
> use as the grid search and random search modules currently are would
> be a big benefit.
>
> My .02c
>
> Kyle
>
> On Mar 7, 2015 7:48 AM, "Christof Angermueller" <c.angermuel...@gmail.com> wrote:
>> Hi Andreas (and others),
>>
>> I am a PhD student in Bioinformatics at the University of Cambridge
>> (EBI/EMBL), supervised by Oliver Stegle and Zoubin Ghahramani. In my
>> PhD, I apply and develop different machine learning algorithms for
>> analyzing biological data.
>>
>> There are different approaches to hyperparameter optimization, some
>> of which you mentioned on the topics page:
>> * Sequential Model-Based Global Optimization (SMBO) ->
>>   http://www.cs.ubc.ca/labs/beta/Projects/SMAC/
>> * Gaussian Processes (GP) -> Spearmint:
>>   https://github.com/JasperSnoek/spearmint
>> * Tree-structured Parzen Estimator Approach (TPE) -> Hyperopt:
>>   http://hyperopt.github.io/hyperopt/
>>
>> And more recent approaches based on neural networks:
>> * Deep Networks for Global Optimization (DNGO) ->
>>   http://arxiv.org/abs/1502.05700
>>
>> The idea is to implement ONE of these approaches, right?
>>
>> Do you prefer a particular approach for theoretical or practical
>> reasons?
>>
>> Spearmint also supports distributing jobs on a cluster (SGE). I
>> imagine that this requires platform-specific code, which could be
>> difficult to maintain. What do you think?
>>
>> Spearmint and hyperopt are already established Python packages.
>> Another sklearn implementation might be considered redundant and be
>> hard to establish. Do you have a particular new feature in mind?
>>
>> Cheers,
>> Christof
>>
>> --
>> Christof Angermueller
>> cangermuel...@gmail.com
>> http://cangermueller.com