Hi all,

I am wondering if anyone else has noticed GridSearch eating more and more
memory over time? I read the related discussion on the GitHub issue tracker,
and it sounds like this has been solved (estimators are no longer kept, and
the best estimator can optionally be refitted at the end of the GridSearch).

However, when I ran the GridSearch, I noticed that it always "crashed" after a
couple of hours. When I monitored the system usage, I saw the memory
utilization increasing almost linearly over time until it reached the 128 GB
maximum of the machine I was running it on.

I then wrote a naive grid search with nested for loops, and it had the same
issue. So it is probably not the grid search itself but something with
Python ...

Eventually, I added the following two lines:

    gc.collect()
    len(gc.get_objects())

which seem to do the trick, especially the second one! Now I can run the
grid search for hours with a constant ~6.8 GB memory utilization.
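
For anyone who wants to try this, here is a minimal sketch of what I mean by
a naive grid search with the two gc lines in it. It is an illustration only:
the names naive_grid_search, evaluate and toy_score are made up for this
example, toy_score merely stands in for fitting and scoring a real estimator,
and placing the gc calls at the end of every iteration is an assumption.

```python
import gc
from itertools import product


def naive_grid_search(evaluate, param_grid):
    """Try every parameter combination and keep only the best score.

    `evaluate` is a hypothetical callable that fits a model with the
    given parameters and returns a score (higher is better).
    """
    best_score, best_params = float("-inf"), None
    tracked_counts = []
    names = sorted(param_grid)
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(**params)
        if score > best_score:
            best_score, best_params = score, params
        # Run a full collection before the next fit starts.
        gc.collect()
        # Record how many objects the collector is tracking; a count
        # that keeps growing suggests something is still being retained.
        tracked_counts.append(len(gc.get_objects()))
    return best_params, best_score, tracked_counts


def toy_score(C, gamma):
    # Stand-in for fitting and scoring a real estimator.
    return -((C - 1.0) ** 2) - (gamma - 0.1) ** 2


best_params, best_score, tracked_counts = naive_grid_search(
    toy_score, {"C": [0.1, 1.0, 10.0], "gamma": [0.01, 0.1, 1.0]}
)
print(best_params, best_score)
print(tracked_counts)
```

Printing tracked_counts gives a rough way to check whether the number of
tracked objects stays flat from one parameter combination to the next.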


I am curious: has anyone else had this memory issue?

Best,
Sebastian
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
