Hello! I use a 2-step Pipeline consisting of an expensive transformer followed by a classifier. On top of this I run GridSearchCV over the classifier's parameters only.
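
A minimal sketch of this kind of setup may make the question concrete. Everything specific in it is an assumption for illustration: `CountingScaler` is a hypothetical stand-in for the expensive transformer (it only wraps StandardScaler and counts how often it is fitted), and the synthetic dataset, LogisticRegression and the grid over `C` are placeholders.

```python
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


class CountingScaler(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in for the expensive transformer; counts fit() calls."""

    # Class-level counter, so it is shared by the clones GridSearchCV creates.
    n_fits = 0

    def fit(self, X, y=None):
        CountingScaler.n_fits += 1
        self.scaler_ = StandardScaler().fit(X)
        return self

    def transform(self, X):
        return self.scaler_.transform(X)


# Placeholder data; the real data set is not part of this sketch.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

pipe = Pipeline([
    ("transform", CountingScaler()),             # no parameters searched here
    ("clf", LogisticRegression(max_iter=1000)),  # only this step is tuned
])

# Only classifier parameters appear in the grid.
param_grid = {"clf__C": [0.1, 1.0, 10.0]}

# n_jobs=1 keeps everything in one process so the counter stays meaningful.
search = GridSearchCV(pipe, param_grid, cv=3, n_jobs=1)
search.fit(X, y)

# 3 candidates x 3 folds + 1 final refit on the full data = 10 transformer fits,
# although the transformer's own parameters never change.
print(CountingScaler.n_fits)
```

In this sketch the transformer is refitted once per candidate and fold, which is the repeated work the question is about.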
In principle, GridSearchCV could detect that I am not varying any parameters of the transformer and avoid redoing that work by keeping the transformed X. As far as I can tell, though, it currently does a clean re-run of all Pipeline steps. Is that right?

Can you recommend the easiest way to use GridSearchCV with a Pipeline while avoiding recomputation of the transformer steps whose parameters are not part of the grid search? I realize this may be tricky, but any pointers on how to do this conveniently and in a way that stays compatible with sklearn would be highly appreciated! (The scoring has to be done on the initial data, so I cannot just transform manually beforehand.)

Regards,
Anton

PS: If all of this makes sense, would it be a useful feature to include in sklearn?
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn