Hello all,
I'm puzzled by the memory use of scikit-learn's GBM implementation. It takes up all available memory and is killed by the OS, and I can't think of why it uses as much memory as it does.

Here is the situation: I have a modest data set of ~4 GB (1800 columns, 550000 rows, all read into a float32 matrix). I can read this in and start training a GBM with no memory issues, but memory use climbs rapidly as I add more estimators to the GBM. Once I get to about 100 trees it is using ~50 GB of memory, which kills my laptop.

I don't understand why this is happening. Each tree is shallow (depth 3), so it shouldn't take up much memory. The only way I can explain the behaviour is if the data is somehow getting copied and stored for each instance of the tree. What am I missing?

Thanks in advance,
Peter
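For reference, a minimal sketch of the setup described above, with a tiny synthetic stand-in for the real 4 GB data set (the regressor variant, data shapes, and random data here are assumptions for illustration; the original matrix is 550000 x 1800 float32):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Tiny synthetic stand-in for the real data: same dtype pattern
# (float32 matrix) as described, but small enough to run anywhere.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 20)).astype(np.float32)
y = rng.standard_normal(500).astype(np.float32)

# Shallow trees (max_depth=3) as in the report; 100 estimators is
# roughly where the memory blow-up was observed on the full data.
gbm = GradientBoostingRegressor(n_estimators=100, max_depth=3)
gbm.fit(X, y)

print(gbm.n_estimators_)  # number of fitted boosting stages -> 100
```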
------------------------------------------------------------------------------
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general