I think you could use imbalanced-learn for the issue you have with y. You should be able to wrap your clustering inside the FunctionSampler (https://github.com/scikit-learn-contrib/imbalanced-learn/pull/342 - we are in the process of merging it).
On 19 December 2017 at 13:44, Manuel Castejón Limas <manuel.caste...@gmail.com> wrote:

> Dear all,
>
> Kudos to scikit-learn! Having said that, Pipeline is killing me not being
> able to transform anything other than X.
>
> My current study case would need:
> - Transformers being able to handle both X and y, e.g. clustering X and y
>   concatenated
> - Pipeline being able to change other params, e.g. sample_weight
>
> Currently, I'm augmenting X through every step with the extra information
> which seems to work ok for my_pipe.fit_transform(X_train,y_train) but
> breaks on my_pipe.transform(X_test) for the lack of the y parameter. Ok, I
> can inherit and modify a descendant from Pipeline class to allow the y
> parameter which is not ideal but I guess it is an option. The gritty part
> comes when having to adapt every regressor at the end of the ladder in
> order to split the extra information from the raw data in X and not being
> able to generate more than one subproduct from each preprocessing step
>
> My current research involves clustering the data and using that
> classification along with X in order to predict outliers which generates
> sample_weight info and I would love to use that on the final regressor.
> Currently there seems not to be another option than pasting that info on X.
>
> All in all, I'm stuck with this API limitation and I would love to learn
> some tricks from you if you could enlighten me.
>
> Thanks in advance!
>
> Manuel Castejón-Limas
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn@python.org
> https://mail.python.org/mailman/listinfo/scikit-learn

-- 
Guillaume Lemaitre
INRIA Saclay - Parietal team
Center for Data Science Paris-Saclay
https://glemaitre.github.io/