Re: [Scikit-learn-general] Project/PR Idea for Faster Automated Model Search

Pedro Rodriguez Wed, 27 Jan 2016 12:16:43 -0800

Thanks for the response, Andy,

The main thing I wanted to get out of asking was:
1. Is this a reasonable thing to try?
2. Has it been done before?


I would want to make it scikit-learn compatible, but having it be a PR
isn't my main goal (only a possible plus). It looks like this would be
interesting to do, and comparing against auto-sklearn would give me an idea
of how well it does against a well-thought-out, well-tuned take on automated
model search.

Your point about the iterations assumption is fair. TuPAQ also assumes that
an iteration is equivalent across algorithms, which isn't true. That is
something else I would be interested in looking at (using a time budget
instead of an iteration count).
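
To make the time-budget idea concrete, here is a rough sketch of the
elimination loop with a per-round budget instead of a fixed iteration count.
It is not the TuPAQ implementation from the paper. ToyModel, step_cost,
budget_per_round and margin are all hypothetical names, and the budget is in
abstract cost units rather than measured wall-clock time. A real version
would wrap estimators that can be trained incrementally.

```python
class ToyModel:
    """Hypothetical stand-in for an incrementally trainable model.

    Each training step closes a fixed fraction of the gap between the
    current score and the best score this configuration can reach.
    """

    def __init__(self, name, ceiling, rate, step_cost):
        self.name = name
        self.ceiling = ceiling      # best score this configuration can reach
        self.rate = rate            # fraction of the remaining gap closed per step
        self.step_cost = step_cost  # assumed cost of one step (must be > 0)
        self.score = 0.0

    def train_step(self):
        """Run one training step and return the cost that was spent."""
        self.score += self.rate * (self.ceiling - self.score)
        return self.step_cost


def budgeted_elimination(models, budget_per_round=10.0, margin=0.1, max_rounds=10):
    """Give every survivor the same budget, then drop the ones far behind."""
    survivors = list(models)
    for _ in range(max_rounds):
        for model in survivors:
            spent = 0.0
            # Cheap models get many steps per round, expensive ones get few.
            while spent + model.step_cost <= budget_per_round:
                spent += model.train_step()
        best = max(m.score for m in survivors)
        # Keep only models within `margin` of the current best score.
        survivors = [m for m in survivors if m.score >= best - margin]
        if len(survivors) == 1:
            break
    return max(survivors, key=lambda m: m.score)


models = [
    ToyModel("linear", ceiling=0.80, rate=0.5, step_cost=1.0),
    ToyModel("forest", ceiling=0.90, rate=0.3, step_cost=2.0),
    ToyModel("slow", ceiling=0.95, rate=0.1, step_cost=5.0),
]
best = budgeted_elimination(models)
print(best.name)
```

In this toy setup the "slow" model is dropped after the first round even
though it has the highest ceiling, which is the kind of risk early
termination carries and something I would want to measure.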

Pedro

On Wed, Jan 27, 2016 at 12:34 PM, Andreas Mueller <t3k...@gmail.com> wrote:

> Hi.
> Also check out this:
> https://github.com/scikit-learn/scikit-learn/pull/5491
>
> auto-sklearn (which uses meta-learning) might also be of interest to you.
>
> From your description TuPAQ seems to assume that there is some notion of
> iterations.
> That is true only for some models. It might be easier to run models on
> subsets of the data.
> That's actually something DataRobot does to screen models faster.
>
> I don't think TuPAQ is ready for inclusion in scikit-learn (way too fresh,
> 2 citations?).
> But if you want to create a scikit-learn compatible implementation, please
> go ahead, that would be great to have for reference.
>
> cheers,
> Andy
>
>
>
> On 01/27/2016 12:01 PM, Pedro Rodriguez wrote:
>
> Hi,
>
> I am considering working on a project which would result in a PR to
> scikit-learn, but would like to check that something like it doesn't
> already exist or isn't already in progress (in or out of SKLearn).
>
> Goal: Implement the algorithm (TuPAQ) described here:
> http://web.cs.ucla.edu/~ameet/tupaq_socc.pdf to make something similar to
> GridSearchCV
>
> Result: Potentially much faster training time over the parameter/model
> space than GridSearchCV
>
> Description of Algorithm:
> 1. Train all models for some number of iterations to kick-start the search
> 2. Drop all models that are not within some margin of the best model
> 3. Repeat steps 1 and 2 based on some heuristic
> 4. Return the best model
>
> Existing Code:
> Didn't find anything in SKLearn like this. The closest thing I found was
> https://github.com/hyperopt/hyperopt-sklearn but it doesn't include some
> of the other methods used in the paper (like early model termination).
>
> Thanks!
> --
> Pedro Rodriguez
> PhD Student in Distributed Machine Learning | CU Boulder
> UC Berkeley AMPLab Alumni
>
> ski.rodrig...@gmail.com | pedrorodriguez.io | 909-353-4423
> Github: github.com/EntilZha | LinkedIn:
> https://www.linkedin.com/in/pedrorodriguezscience
>
>
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general


