Re: [Scikit-learn-general] Re-cycling pipeline stages in GridSearchCV?

2013-06-08 Thread Joel Nothman
On Fri, Jun 7, 2013 at 11:59 PM, Gael Varoquaux gael.varoqu...@normalesup.org wrote: Memoization and parallelization don't play along nicely. Yes, I am strongly thinking of adding optional memoization directly to joblib.Parallel. It is often a fairly natural place to put a memoization as

Re: [Scikit-learn-general] Re-cycling pipeline stages in GridSearchCV?

2013-06-08 Thread Gael Varoquaux
I don't see how that helps Pipeline; perhaps expand your idea a bit...? It doesn't. At all. I think that pipeline can be improved by memoizing the transforms, or the transformer's fit. G
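
To make "memoizing the transformer's fit" concrete, here is a minimal sketch using only the standard library. It is not joblib's or scikit-learn's implementation; `MemoizedFit` and `MeanCenterer` are hypothetical names invented for illustration, and the cache key (a hash of the pickled parameters and data) is an assumption about how such a cache might be keyed.

```python
import hashlib
import pickle


class MeanCenterer:
    """Toy transformer (hypothetical): subtract the mean, then scale."""

    def __init__(self, scale=1.0):
        self.scale = scale

    def fit(self, X):
        self.mean_ = sum(X) / len(X)
        return self

    def transform(self, X):
        return [(x - self.mean_) * self.scale for x in X]


class MemoizedFit:
    """Hypothetical helper: reuse a fitted transformer for identical inputs."""

    def __init__(self):
        self._cache = {}
        self.n_fits = 0  # number of real (non-cached) fits performed

    def fit(self, transformer_cls, params, X):
        # Key on the transformer's name, its parameters and the training data.
        payload = (transformer_cls.__qualname__, sorted(params.items()), X)
        key = hashlib.sha1(pickle.dumps(payload)).hexdigest()
        if key not in self._cache:
            self._cache[key] = transformer_cls(**params).fit(X)
            self.n_fits += 1
        return self._cache[key]


cache = MemoizedFit()
X = [1.0, 2.0, 3.0]
# Three grid points that differ only in a later step's parameter:
# the first step is fitted once and then served from the cache.
for later_step_param in (0.1, 1, 10):
    fitted = cache.fit(MeanCenterer, {"scale": 2.0}, X)
```

An in-memory dict like this is not shared between worker processes, which is one way memoization and parallelization can clash; a disk-backed cache is the usual way around that.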

Re: [Scikit-learn-general] Re-cycling pipeline stages in GridSearchCV?

2013-06-07 Thread Andreas Mueller
On 06/07/2013 12:08 AM, Joel Nothman wrote: I proposed something that did this among a more general solution for warm starts without memoizing a couple of weeks ago, but I think memoizing is neater and handles most cases. To handle it generally, you could add a memoize parameter to

Re: [Scikit-learn-general] Re-cycling pipeline stages in GridSearchCV?

2013-06-07 Thread Gael Varoquaux
Memoization and parallelization don't play along nicely. Yes, I am strongly thinking of adding optional memoization directly to joblib.Parallel. It is often a fairly natural place to put a memoization as structures should be pickleable and data transfer should be limited. What do people think?

[Scikit-learn-general] Re-cycling pipeline stages in GridSearchCV?

2013-06-06 Thread Romaniuk, Michal
Hi, I noticed that GridSearchCV fits a new estimator from scratch for each grid point. But when working with pipelines where multiple steps have tuning parameters, some time could be saved by fitting an early step once and then fitting the later steps along a sequence of grid points while
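
A rough way to see how much time is at stake is to count first-step fits with and without reuse. The sketch below is only an illustration: the grid, the step names (`pca`, `svm`) and the fold count are made-up assumptions, and `count_first_step_fits` is a hypothetical helper, not part of scikit-learn.

```python
from itertools import product


def count_first_step_fits(grid, first_step, n_folds):
    """Return (naive, reused) counts of first-step fits for a parameter grid.

    naive:  the first step is refitted at every grid point and fold.
    reused: the first step is fitted once per distinct setting of its own
            parameters and per fold, then shared across the later steps.
    """
    names = list(grid)
    points = [dict(zip(names, values)) for values in product(*grid.values())]
    naive = len(points) * n_folds

    # Parameters of the first step follow the "<step>__<param>" naming style.
    prefix = first_step + "__"
    first_keys = [name for name in names if name.startswith(prefix)]
    distinct = {tuple(point[k] for k in first_keys) for point in points}
    reused = len(distinct) * n_folds
    return naive, reused


# Hypothetical two-step pipeline: 2 settings for the first step,
# 4 settings for the second, 3 cross-validation folds.
grid = {"pca__n_components": [5, 10], "svm__C": [0.1, 1, 10, 100]}
naive, reused = count_first_step_fits(grid, "pca", n_folds=3)
```

With these assumed numbers the first step is fitted 24 times without reuse and 6 times with it; the saving grows with the number of settings for the later steps.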

Re: [Scikit-learn-general] Re-cycling pipeline stages in GridSearchCV?

2013-06-06 Thread Gael Varoquaux
Using a joblib.Memory in a clever way is how I would like to address this. I have no precise idea of how I would do this, though. G

Re: [Scikit-learn-general] Re-cycling pipeline stages in GridSearchCV?

2013-06-06 Thread Joel Nothman
I proposed something that did this among a more general solution for warm starts without memoizing a couple of weeks ago, but I think memoizing is neater and handles most cases. To handle it generally, you could add a memoize parameter to Pipeline. Then I guess you'd have to do some subset of: *