Hi Jacob,

Thank you for the clarification.

My problem, however, is the size of the data in terms of the number of
samples. The features are engineered, and there are only 80 of them. I wanted
to try training on a bigger set of data to improve the results.

Thanks & Best,
Maryam


> Date: Wed, 30 Sep 2015 12:11:00 -0700
> From: Jacob Schreiber <jmschreibe...@gmail.com>
> Subject: Re: [Scikit-learn-general] Scalability of Gradient Boosting Classifier
> To: scikit-learn-general@lists.sourceforge.net
>
> Hi Maryam
>
> Currently, no tree-based methods have a partial_fit method. We are working
> on expanding the tree module; you can see our checklist here:
> https://github.com/scikit-learn/scikit-learn/issues/5212
>
> There are many methods to reduce the dimensionality of data if you are
> using high-dimensional data, including Gaussian random projections or LSH
> for continuous data, and collapsing samples into higher-weight samples for
> discrete data. If you provide more information about your use case, I may
> be able to help more.
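
To illustrate the "collapsing samples into higher-weight samples" idea for
discrete data, here is a minimal sketch in plain Python. It assumes the data
contains many exactly duplicated (features, label) rows; the helper name
collapse_duplicates and the toy data are hypothetical, not taken from
scikit-learn or from this thread.

```python
from collections import Counter


def collapse_duplicates(rows, labels):
    """Merge identical (features, label) pairs into one row plus a count.

    Hypothetical helper for illustration; assumes discrete, hashable values.
    """
    # Count how often each distinct (features, label) pair occurs.
    counts = Counter((tuple(r), y) for r, y in zip(rows, labels))
    X, y, weights = [], [], []
    for (features, label), n in counts.items():
        X.append(list(features))
        y.append(label)
        weights.append(n)  # the weight is the number of collapsed rows
    return X, y, weights


# Toy data: six rows, only three distinct (features, label) pairs.
rows = [[0, 1], [0, 1], [1, 1], [0, 1], [1, 1], [1, 0]]
labels = [0, 0, 1, 0, 1, 1]

X_small, y_small, weights = collapse_duplicates(rows, labels)

# The total weight equals the original number of samples.
print(len(rows), "rows collapsed to", len(X_small), "weighted rows")
```

The resulting counts could then be passed as sample_weight to an estimator's
fit method, which GradientBoostingClassifier.fit accepts. How much memory this
saves depends entirely on how many exact duplicates the data contains.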
>
> Jacob
>
> On Wed, Sep 30, 2015 at 5:12 AM, Maryam Tavakol <maryam.tavakol...@gmail.com> wrote:
>
> > Dear all,
> >
> > I am using the Gradient Boosting Classifier from scikit-learn on a huge
> > data set. Unfortunately, the method loads all of the data into memory
> > (around 45 GB!). As it is not very easy to modify the code to stream data,
> > is there any other way to make it scalable?
> >
> > Best Regards,
> > Maryam Tavakol
> >
> >
> >
> ------------------------------------------------------------------------------
> >
> > _______________________________________________
> > Scikit-learn-general mailing list
> > Scikit-learn-general@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
> >
> >
>