Eager to learn! Diving into the code right now! Thanks for the tip!

Manuel

2017-12-19 14:18 GMT+01:00 Guillaume Lemaître <g.lemaitr...@gmail.com>:

> I think that you could use imbalanced-learn regarding the issue that
> you have with the y.
> You should be able to wrap your clustering inside the FunctionSampler (
> https://github.com/scikit-learn-contrib/imbalanced-learn/pull/342 - we
> are on the way to merge it)
>
> On 19 December 2017 at 13:44, Manuel Castejón Limas <
> manuel.caste...@gmail.com> wrote:
>
>> Dear all,
>>
>> Kudos to scikit-learn! Having said that, Pipeline is killing me not being
>> able to transform anything other than X.
>>
>> My current study case would need:
>> - Transformers being able to handle both X and y, e.g. clustering X and y
>> concatenated
>> - Pipeline being able to change other params, e.g. sample_weight
>>
>> Currently, I'm augmenting X through every step with the extra information
>> which seems to work ok for my_pipe.fit_transform(X_train,y_train) but
>> breaks on my_pipe.transform(X_test) for the lack of the y parameter. Ok, I
>> can inherit and modify a descendant from Pipeline class to allow the y
>> parameter which is not ideal but I guess it is an option. The gritty part
>> comes when having to adapt every regressor at the end of the ladder in
>> order to split the extra information from the raw data in X and not being
>> able to generate more than one subproduct from each preprocessing step
>>
>> My current research involves clustering the data and using that
>> classification along with X in order to predict outliers which generates
>> sample_weight info and I would love to use that on the final regressor.
>> Currently there seems not to be another option than pasting that info on X.
>>
>> All in all, I'm stuck with this API limitation and I would love to learn
>> some tricks from you if you could enlighten me.
>>
>> Thanks in advance!
>>
>> Manuel Castejón-Limas
>>
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn@python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>
> --
> Guillaume Lemaitre
> INRIA Saclay - Parietal team
> Center for Data Science Paris-Saclay
> https://glemaitre.github.io/
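
For readers following the thread, here is a minimal sketch of one way around the sample_weight limitation described in the quoted message, using plain scikit-learn only. It is not the FunctionSampler approach from the linked pull request, and it is not something proposed in the thread. The class name ClusterWeightedRegressor, the choice of KMeans, and the 1 / (1 + distance) weighting rule are all hypothetical choices made for illustration. The idea is that the final pipeline step clusters X and y concatenated during fit, derives sample_weight from that, and passes it to the wrapped regressor, so predict never needs y.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin, clone
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


class ClusterWeightedRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical wrapper: cluster [X, y] at fit time, turn the result
    into sample_weight, and fit the wrapped regressor with it."""

    def __init__(self, regressor=None, n_clusters=3, random_state=0):
        self.regressor = regressor
        self.n_clusters = n_clusters
        self.random_state = random_state

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        # Cluster X and y concatenated; y is only needed here, during fit.
        Xy = np.column_stack([X, y])
        self.clusterer_ = KMeans(
            n_clusters=self.n_clusters, n_init=10, random_state=self.random_state
        ).fit(Xy)
        # Distance of each sample to the centroid of its own cluster.
        centers = self.clusterer_.cluster_centers_[self.clusterer_.labels_]
        dist = np.linalg.norm(Xy - centers, axis=1)
        # Illustrative rule (an assumption): samples far from their
        # centroid look more like outliers and get a smaller weight.
        self.sample_weight_ = 1.0 / (1.0 + dist)
        base = LinearRegression() if self.regressor is None else self.regressor
        self.regressor_ = clone(base).fit(X, y, sample_weight=self.sample_weight_)
        return self

    def predict(self, X):
        # No y required at prediction time.
        return self.regressor_.predict(np.asarray(X, dtype=float))


# Synthetic data, for demonstration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))
y = X @ np.array([1.5, -2.0]) + rng.normal(scale=0.1, size=60)
y[:3] += 15.0  # a few artificial outliers in the training part

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("reg", ClusterWeightedRegressor(n_clusters=3)),
])
pipe.fit(X[:50], y[:50])
y_pred = pipe.predict(X[50:])
weights = pipe.named_steps["reg"].sample_weight_
```

Because the weighting lives inside the last step, the preprocessing steps stay ordinary transformers on X and the extra information never has to be pasted onto X. The trade-off is that the clustering cannot be shared with other steps; dropping or resampling rows instead of weighting them is where a sampler such as the one in the linked pull request would fit.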