Hi, Andy,

the models that I am using are Random Forests and naive Bayes classifiers.

Maybe it's something in scipy, according to the discussion Manoj linked ... in any 
case, maybe a workaround for this issue and future issues would be to have a 
"force_clear_gc" (default=False) parameter that forces a garbage collection 
after every cycle for estimators and GridSearch?
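
A minimal sketch of the idea, assuming a hand-written grid search loop rather than scikit-learn's own GridSearch. The names naive_grid_search, fit_and_score and toy_score are made up for illustration, and force_clear_gc is only the flag proposed above, not an existing scikit-learn option:

```python
import gc
from itertools import product


def naive_grid_search(fit_and_score, param_grid, force_clear_gc=False):
    """Try every parameter combination and return (best_score, best_params).

    fit_and_score: hypothetical callable that takes the parameters as keyword
        arguments, fits a model, and returns a score (higher is better).
    force_clear_gc: hypothetical flag; when True, gc.collect() is called
        after every cycle.
    """
    names = sorted(param_grid)
    best_score, best_params = None, None
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = fit_and_score(**params)
        if best_score is None or score > best_score:
            best_score, best_params = score, params
        if force_clear_gc:
            # Force a full collection, including objects in reference cycles.
            gc.collect()
    return best_score, best_params


def toy_score(n_estimators, max_depth):
    # Stand-in for fitting and scoring a real model.
    return -abs(n_estimators - 100) - abs(max_depth - 5)


grid = {"n_estimators": [10, 100, 500], "max_depth": [3, 5, 10]}
best_score, best_params = naive_grid_search(toy_score, grid, force_clear_gc=True)
print(best_score, best_params)
```

In a real run, toy_score would be replaced by a function that fits the classifier and returns a cross-validation score. Whether the extra collection helps would depend on where the memory is actually being held.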

Here are more details about the particular system setup. scikit-learn and scipy 
should be up to the most recent versions. Python is installed via conda.

Python 3.4.2 |Continuum Analytics, Inc.| (default, Oct 21 2014, 17:16:37) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sklearn
>>> import scipy
>>> scipy.__version__
'0.14.0'
>>> sklearn.__version__
'0.15.2'

Best,
Sebastian


> On Dec 16, 2014, at 11:33 AM, Andy <t3k...@gmail.com> wrote:
> 
> Hi.
> 
> Which models are you using and which version of scikit-learn?
> 
> Cheers,
> Andy
> 
> On 12/16/2014 11:19 AM, Sebastian Raschka wrote:
>> Hi all,
>> 
>> I am wondering if someone noticed that GridSearch is eating more and more 
>> memory over time? I read related discussion on the issue list on GitHub and 
>> it sounds like that it has been solved (estimators are not kept anymore, and 
>> the best estimator can optionally be refitted at the end of the GridSearch).
>> 
>> However, when I ran the GridSearch, I noticed that it always "crashed" after 
>> a couple of hours. When I monitored the system usage over time, I saw the 
>> memory utilization (almost linearly) increasing over time until it reached 
>> the 128 Gb max of the machine I was running it on.
>> 
>> I then wrote a naive grid search with nested for loops and it had the same 
>> issues. So, it is probably not the grid search but something with Python ...
>> 
>> Eventually, I added the 2 lines
>> 
>>     gc.collect()
>>     len(gc.get_objects())
>> 
>> which seem to do the trick! Especially the 2nd one. Now, I can run the 
>> gridsearch for hours and with a constant ~6.8 Gb memory utilization.
>> 
>> 
>> I am curious, did anyone else have this memory issue?
>> 
>> Best,
>> Sebastian
>> ------------------------------------------------------------------------------
>> Download BIRT iHub F-Type - The Free Enterprise-Grade BIRT Server
>> from Actuate! Instantly Supercharge Your Business Reports and Dashboards
>> with Interactivity, Sharing, Native Excel Exports, App Integration & more
>> Get technology previously reserved for billion-dollar corporations, FREE
>> http://pubads.g.doubleclick.net/gampad/clk?id=164703151&iu=/4140/ostg.clktrk
>> _______________________________________________
>> Scikit-learn-general mailing list
>> Scikit-learn-general@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
> 
> 

