Hi sklearners!

I think mini-batch training and averaging are both desirable features for
the SGD class.

As far as I know, using mini-batches can help avoid getting stuck in local
minima and might converge faster than plain SGD. The gradient computation
inside one mini-batch can also be parallelized or expressed as a matrix
multiplication, so it might speed up computation as well.
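
To make the idea concrete, here is a minimal sketch of what I mean. It is
plain Python on a toy least-squares problem, not code from sgd_fast.pyx, and
the names (minibatch_sgd, batch_size, lr, n_epochs) are made up for
illustration. The gradient is accumulated over the whole mini-batch and the
weights are updated once per batch; the inner loop over the batch is the part
that could become a matrix product or be split across workers.

```python
import random


def minibatch_sgd(X, y, batch_size=1, lr=0.05, n_epochs=200, seed=0):
    """Fit y ~ w.x + b by least squares with mini-batch SGD.

    batch_size=1 recovers plain SGD; batch_size=len(X) gives
    full-batch gradient descent.
    """
    rng = random.Random(seed)
    n_samples, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    b = 0.0
    indices = list(range(n_samples))
    for _ in range(n_epochs):
        rng.shuffle(indices)
        for start in range(0, n_samples, batch_size):
            batch = indices[start:start + batch_size]
            grad_w = [0.0] * n_features
            grad_b = 0.0
            # Accumulate the gradient over the batch without touching w.
            for i in batch:
                err = sum(wj * xj for wj, xj in zip(w, X[i])) + b - y[i]
                for j in range(n_features):
                    grad_w[j] += err * X[i][j]
                grad_b += err
            # One weight update per mini-batch, using the mean gradient.
            scale = lr / len(batch)
            for j in range(n_features):
                w[j] -= scale * grad_w[j]
            b -= scale * grad_b
    return w, b


# Noise-free toy data generated from y = 2*x0 - 3*x1 + 1.
data_rng = random.Random(42)
X = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(200)]
y = [2 * x0 - 3 * x1 + 1 for x0, x1 in X]

w, b = minibatch_sgd(X, y, batch_size=10)
print(w, b)
```

With batch_size=1 this reduces to ordinary per-sample SGD, which is why the
parameter could default to 1 without changing current behaviour.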

I don't know what the upcoming NN class in sklearn uses, but I think
mini-batch SGD is often used in NN training, so if that implementation uses
the SGD class, mini-batch support could benefit the performance of the NN in
sklearn as well.

Anyway, it is not much code to add, and the batch size is just another
parameter of the class: it can be turned off (or set to 1) by default, tuned
like any other hyperparameter, and easily explained to new users. I see only
advantages in adding it.

Best,
Tobias





On Sun, May 4, 2014 at 9:48 AM, Alexandre Gramfort <
alexandre.gramf...@telecom-paristech.fr> wrote:

> hi sklearners,
>
> FYI Danny Sullivan https://github.com/dsullivan7
> will work at Telecom ParisTech with me as a scikit-learn
> engineer starting this summer. These topics
> (SGD improvements, averaging, SAG etc.) are part
> of the roadmap.
>
> I think he will start by setting up a benchmark for online
> supervised estimators on sparse and dense data and
> then bench the different stochastic variants.
>
> However I am sure that any work done before he starts
> this summer will be useful :)
>
> Before Danny actually starts we'll send an email to the list
> to detail the plan and get feedback.
>
> Best,
> Alex
>
>
> On Sat, May 3, 2014 at 7:08 PM, Andy <t3k...@gmail.com> wrote:
> > On 05/03/2014 05:37 PM, Mathieu Blondel wrote:
> >
> > Same feeling as Andy. I'd favor implementing averaging instead.
> >
> > I totally forgot that we don't have averaging yet ^^
> > I'd be in favor of geometric averaging: http://arxiv.org/abs/1212.2002
> >
> >
> >
> >
> > Mathieu
> >
> >
> > On Sun, May 4, 2014 at 12:23 AM, Andy <t3k...@gmail.com> wrote:
> >>
> >> Hi Sean.
> >> For linear classifiers I'm not really aware of benefits in doing
> >> mini-batch training, and I don't think it is widely used (someone
> >> correct me if I'm wrong).
> >> Usually we only like to add features that have a clear benefit for the
> >> users, to prevent scikit-learn from becoming bloated.
> >>
> >> Do you have a particular use-case where it is important?
> >>
> >> Cheers,
> >> Andy
> >>
> >>
> >> On 05/01/2014 02:04 AM, Sean Violante wrote:
> >>
> >> Hi
> >>
> >> I was wondering if there is any interest in implementing minibatch/batch
> >> for the SGD algorithm. As I understand it, this is not implemented
> >>
> >> "There is a compromise between the two forms, which is often called
> >> "mini-batches", where the true gradient is approximated by a sum over a
> >> small number of training examples."
> >>
> >> http://en.wikipedia.org/wiki/Stochastic_gradient_descent
> >>
> >>
> >>
> >> This would be doing a partial_fit (on a small number of training
> >> examples) but updating the weights only after each epoch rather than
> >> after each training sample.
> >>
> >>
> >> as far as I can see it would only require a flag in the sgd_fast.pyx
> >> code.
> >>
> >> thanks
> >>
> >> Sean
> >>
> >>
> >>
> >>
> ------------------------------------------------------------------------------
> >> "Accelerate Dev Cycles with Automated Cross-Browser Testing - For FREE
> >> Instantly run your Selenium tests across 300+ browser/OS combos.  Get
> >> unparalleled scalability from the best Selenium testing platform
> >> available.
> >> Simple to use. Nothing to install. Get started now for free."
> >> http://p.sf.net/sfu/SauceLabs
> >>
> >>
> >>
> >> _______________________________________________
> >> Scikit-learn-general mailing list
> >> Scikit-learn-general@lists.sourceforge.net
> >> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
> >>
> >
> >
>
>