I think you could use imbalanced-learn for the issue you have with y.
You should be able to wrap your clustering inside the FunctionSampler (
https://github.com/scikit-learn-contrib/imbalanced-learn/pull/342 - we are
in the process of merging it).

On 19 December 2017 at 13:44, Manuel Castejón Limas <
manuel.caste...@gmail.com> wrote:

> Dear all,
>
> Kudos to scikit-learn! Having said that, Pipeline is killing me by not
> being able to transform anything other than X.
>
> My current study case would need:
> - Transformers being able to handle both X and y, e.g. clustering X and y
> concatenated
> - Pipeline being able to change other params, e.g. sample_weight
>
> Currently, I'm augmenting X at every step with the extra information,
> which seems to work OK for my_pipe.fit_transform(X_train, y_train) but
> breaks on my_pipe.transform(X_test) for lack of the y parameter. OK, I
> can subclass Pipeline to allow the y parameter, which is not ideal but
> I guess it is an option. The gritty part comes when having to adapt every
> regressor at the end of the ladder to split the extra information from
> the raw data in X, and not being able to generate more than one
> by-product from each preprocessing step.
>
> My current research involves clustering the data and using that
> classification along with X in order to predict outliers, which generates
> sample_weight info, and I would love to use that in the final regressor.
> Currently there seems to be no option other than pasting that info onto X.
>
> All in all, I'm stuck with this API limitation and I would love to learn
> some tricks from you if you could enlighten me.
>
> Thanks in advance!
>
> Manuel Castejón-Limas
>
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn@python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
>


-- 
Guillaume Lemaitre
INRIA Saclay - Parietal team
Center for Data Science Paris-Saclay
https://glemaitre.github.io/