Thank you for your feedback!

On 10/01/2018 09:11 PM, Jason Sanchez wrote:
> The current roadmap is amazing. One feature that would be exciting is better support for multilayer stacking, with caching and the ability to add models to already-trained layers.

> I saw this history: https://github.com/scikit-learn/scikit-learn/pull/8960

I think we still want to include something like this. I guess it wasn't considered major enough to make the roadmap: the roadmap mostly covers API changes and things that affect more than one estimator, whereas this is, for the most part, "just" adding an estimator.

> This library is very close:
> * API is somewhat awkward, but otherwise good. Does not cache
>   intermediate steps. https://wolpert.readthedocs.io/en/latest/index.html

If we reuse pipelines, we might get this "for free" to some degree.
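To make that concrete, here is a minimal sketch of what I have in mind. Everything below is hypothetical (the StackingLayer name in particular is made up, not a proposed API): each layer is a transformer whose fit_transform emits out-of-fold predictions, so multilayer stacking falls out of ordinary Pipeline composition.

    import numpy as np
    from sklearn.base import BaseEstimator, TransformerMixin, clone
    from sklearn.model_selection import cross_val_predict

    class StackingLayer(BaseEstimator, TransformerMixin):
        # Hypothetical sketch: one stacking layer as a transformer.
        def __init__(self, estimators, cv=5):
            self.estimators = estimators
            self.cv = cv

        def fit(self, X, y):
            # Refit every base model on the full data; these are the
            # models used at prediction time.
            self.estimators_ = [clone(e).fit(X, y) for e in self.estimators]
            return self

        def fit_transform(self, X, y):
            # At training time, hand out-of-fold predictions to the next
            # layer, so it never sees predictions a model made on its own
            # training points.
            self.fit(X, y)
            oof = [cross_val_predict(clone(e), X, y, cv=self.cv,
                                     method='predict_proba')
                   for e in self.estimators]
            return np.hstack(oof)

        def transform(self, X):
            # At prediction time, just stack the fitted models' outputs.
            return np.hstack([e.predict_proba(X)
                              for e in self.estimators_])

A two-layer stack is then Pipeline([('l1', StackingLayer([...])), ('l2', StackingLayer([...])), ('final', LogisticRegression())]), and Pipeline's existing memory= option is the obvious place to hang the caching Jason mentioned.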


> As another data point, I attached a simple implementation I put together to illustrate what I think are the core needs of this feature. Feel free to browse the code. Here is the short list:
> * Infinite layers (or at least 3 ;) )

Pretty sure that'll happen.
> * Choice of CV or OOB for each model

This is less likely to happen in an initial version, I think; the two approaches have traditionally been kept very separate. We could potentially add making this easier to the roadmap (actually, I just did).
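For concreteness, here is roughly what the two options look like with today's API (only existing scikit-learn calls below; the stacking around them is omitted):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_predict

    X, y = make_classification(n_samples=500, random_state=0)

    # CV route: out-of-fold predictions. Works for any estimator,
    # at the price of cv extra fits per model.
    oof = cross_val_predict(RandomForestClassifier(random_state=0),
                            X, y, cv=5, method='predict_proba')

    # OOB route: no extra fits, but only bagging ensembles expose it.
    rf = RandomForestClassifier(n_estimators=200, oob_score=True,
                                random_state=0).fit(X, y)
    oob = rf.oob_decision_function_  # shape (n_samples, n_classes)

Both yield per-sample predictions from models that did not train on that sample, which is what the next layer needs; they just get there by very different machinery.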
> * Ability to add a new model to a layer after the stacked ensemble has been trained, and refit the pipeline such that only the models that must be retrained are retrained (i.e. train the added model and retrain all models in higher layers)

This is the "freezing estimators" item that's on the roadmap.
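The gist, as a toy sketch (the Frozen name is invented here; whatever lands in scikit-learn will look different), is that freezing turns fit into a no-op, so refitting the whole pipeline retrains only what actually changed:

    from sklearn.base import BaseEstimator

    class Frozen(BaseEstimator):
        # Toy sketch: wrap an already-fitted estimator so that refitting
        # the surrounding pipeline leaves it untouched.
        def __init__(self, fitted_estimator):
            self.fitted_estimator = fitted_estimator

        def fit(self, X, y=None):
            return self  # deliberately a no-op: keep the trained model

        def __getattr__(self, name):
            # Delegate predict/predict_proba/transform/... to the
            # wrapped, already-fitted model.
            return getattr(self.fitted_estimator, name)

Adding a model to a trained layer would then mean wrapping the layer's existing members in Frozen, appending the newcomer, and fitting the pipeline again: only the new model and the layers above it pay the retraining cost.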
> * All the standard scikit-learn pipeline goodness (introspection, grid search, serializability, etc.)

That's a given for anything in sklearn ;)
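For instance, assuming the hypothetical StackingLayer sketched earlier in this mail, grid search over a stack needs nothing new:

    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import Pipeline
    from sklearn.tree import DecisionTreeClassifier

    stack = Pipeline([
        ('layer1', StackingLayer([DecisionTreeClassifier(random_state=0)])),
        ('final', LogisticRegression()),
    ])
    # Nested parameters are addressable the usual way:
    search = GridSearchCV(stack, {'final__C': [0.1, 1.0, 10.0]}, cv=3)
    search.fit(X, y)  # X, y as in the CV/OOB snippet above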
