Hey all - I'm running into some memory issues with GridSearchCV and I
wonder if anyone can give an intuition as to why.

I'm cross-validating alpha parameters for Ridge regression. I'm trying 8
different parameters. My inputs are 2400x1900 (~370MB) in size.
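
As a sanity check on the sizes above, here is a quick sketch of the arithmetic. It assumes X is a dense float64 array (the dtype isn't stated above), and `array_megabytes` is just a helper name made up for this example:

```python
def array_megabytes(n_rows, n_cols, bytes_per_item=8):
    """Size in MB (10**6 bytes) of a dense n_rows x n_cols array.

    bytes_per_item=8 assumes float64, NumPy's default float dtype.
    """
    return n_rows * n_cols * bytes_per_item / 1e6


# The shape quoted above, under the float64 assumption.
print(array_megabytes(2400, 1900))  # 36.48
```

Under that assumption a 2400x1900 array comes to roughly 36 MB, not ~370 MB, so it is probably worth confirming the real footprint with `X.shape` and `X.nbytes` before comparing against the %memit numbers.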

When I run

%memit model.fit(X, y)


alone, I get a peak memory usage of ~370MB, just double the size of
the inputs I gave the function.

However, when I run

grid = GridSearchCV(model, {'alpha':[list of 6 values]}, cv=5)
grid.fit(X, y)


I get a whopping 8120MB increment in memory.

Is this normal behavior? It seems that keeping n_jobs == 1 should prevent
any data copying from happening, but clearly something is being copied
many, many times.

Any idea what's going on?

Chris
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general