Hi,

I think your hypothesis is correct. We recently switched from having one splitter for all trees to having one splitter per tree. I can submit a hotfix tonight to prevent the data from being held multiple times.
Jacob

On Thu, Oct 8, 2015 at 3:16 PM, Andreas Mueller <t3k...@gmail.com> wrote:

> Hm, that does sound a bit odd.
> Maybe the memory_profiler will shed light on it?
> https://pypi.python.org/pypi/memory_profiler
>
> So if you use less than 100 trees it runs through?
>
> Andy
>
> On 10/08/2015 06:12 PM, Peter Rickwood wrote:
>
> Hello all,
>
> I'm puzzled by the memory use of sklearn's GBM implementation. It takes up
> all available memory and is forced to terminate by the OS, and I can't think
> of why it is using as much memory as it does.
>
> Here is the situation:
>
> I have a modest data set of size ~4GB (1800 columns, 550000 rows, all read
> in to a float32 matrix).
>
> I can read this in and start training a GBM with no memory issues, but the
> memory use climbs rapidly as I add more estimators to the GBM. Once I get
> to about 100 trees it is using ~50GB of memory, which kills my laptop.
>
> I don't understand why this is happening. Each tree is shallow (depth 3) so
> shouldn't take up much memory. The only way I can understand the behaviour
> is if the data is somehow getting copied and stored for each instance of
> the tree.
>
> What am I missing?
>
> Thanks in advance
>
> Peter
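
For readers following this thread, here is a minimal sketch of the effect being discussed: memory that grows with the number of trees when each per-tree helper holds its own copy of the data, versus staying flat when all helpers share one buffer. `ToySplitter`, `traced_bytes`, and `dataset_bytes` are hypothetical names made up for illustration. This is not scikit-learn's splitter code; it uses only the standard library (`tracemalloc`) and a 1 MB toy buffer rather than the real matrix.

```python
import tracemalloc


def dataset_bytes(n_rows, n_cols, itemsize=4):
    """Size in bytes of a dense n_rows x n_cols matrix (itemsize=4 for float32)."""
    return n_rows * n_cols * itemsize


class ToySplitter:
    """Hypothetical stand-in for a per-tree helper; NOT scikit-learn's splitter."""

    def __init__(self, data, copy):
        # copy=True mimics each helper holding its own copy of the data;
        # copy=False mimics all helpers referencing one shared buffer.
        self.data = bytearray(data) if copy else data


def traced_bytes(n_trees, data, copy):
    """Return the bytes allocated while building n_trees helpers."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    splitters = [ToySplitter(data, copy) for _ in range(n_trees)]
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


# One float32 copy of the matrix described in the thread.
one_copy = dataset_bytes(550_000, 1_800)
print(f"one copy of the data: {one_copy / 1e9:.2f} GB")

# Toy comparison: 20 helpers over a 1 MB buffer.
data = bytearray(1_000_000)
shared = traced_bytes(20, data, copy=False)
copied = traced_bytes(20, data, copy=True)
print(f"shared buffer: {shared} bytes allocated")
print(f"copy per helper: {copied} bytes allocated")
```

The first figure works out to about 3.96 GB, consistent with the ~4GB quoted above. In the toy comparison, the shared case allocates only the small helper objects, while the copying case allocates at least one full buffer per helper, so its footprint scales with the number of helpers.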
------------------------------------------------------------------------------
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general