Eager to learn! Diving into the code right now!

Thanks for the tip!
Manuel
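
A minimal sketch of the sample_weight part of the question quoted below, using only scikit-learn. It assumes the weights can be computed once from the training data before fitting, so nothing extra has to be pasted onto X. The helper cluster_based_weights, its distance-based weighting rule, the toy data and the step names "scale" and "reg" are all hypothetical and only for illustration; this is not the FunctionSampler approach from the tip.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def cluster_based_weights(X, y, n_clusters=3, random_state=0):
    """Hypothetical heuristic: down-weight samples far from their cluster centre.

    X and y are clustered together, as described in the question. The
    weighting rule is an illustration, not a recommended outlier detector.
    """
    Xy = StandardScaler().fit_transform(np.column_stack([X, y]))
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    # KMeans.fit_transform returns each sample's distance to every centre;
    # keep the distance to the nearest one.
    dist = km.fit_transform(Xy).min(axis=1)
    return 1.0 / (1.0 + dist)


# Toy data (assumption: any regression data set would do here).
rng = np.random.RandomState(0)
X_train = rng.normal(size=(200, 4))
y_train = X_train @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=200)
X_test = rng.normal(size=(20, 4))

weights = cluster_based_weights(X_train, y_train)

my_pipe = Pipeline([("scale", StandardScaler()), ("reg", Ridge())])
# The "reg__" prefix hands the array to the fit method of the "reg" step,
# i.e. Ridge.fit(..., sample_weight=weights).
my_pipe.fit(X_train, y_train, reg__sample_weight=weights)

y_pred = my_pipe.predict(X_test)
```

Because the weights only enter through fit, my_pipe.predict(X_test) needs no y. Steps that must change or resample y during fit are outside what this sketch covers.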

2017-12-19 14:18 GMT+01:00 Guillaume Lemaître <g.lemaitr...@gmail.com>:

> I think that you could use imbalanced-learn for the issue that
> you have with y.
> You should be able to wrap your clustering inside the FunctionSampler (
> https://github.com/scikit-learn-contrib/imbalanced-learn/pull/342 - we
> are about to merge it)
>
> On 19 December 2017 at 13:44, Manuel Castejón Limas <
> manuel.caste...@gmail.com> wrote:
>
>> Dear all,
>>
>> Kudos to scikit-learn! Having said that, Pipeline is killing me, as it is
>> not able to transform anything other than X.
>>
>> My current case study would need:
>> - Transformers being able to handle both X and y, e.g. clustering X and y
>> concatenated
>> - Pipeline being able to change other params, e.g. sample_weight
>>
>> Currently, I'm augmenting X at every step with the extra information,
>> which seems to work OK for my_pipe.fit_transform(X_train, y_train) but
>> breaks on my_pipe.transform(X_test) for lack of the y parameter. OK, I
>> can subclass Pipeline and modify the descendant to allow the y
>> parameter, which is not ideal but I guess it is an option. The gritty part
>> comes when having to adapt every regressor at the end of the ladder in
>> order to split the extra information from the raw data in X, while not
>> being able to generate more than one by-product from each preprocessing
>> step.
>>
>> My current research involves clustering the data and using that
>> classification along with X to predict outliers, which generates
>> sample_weight info that I would love to use in the final regressor.
>> Currently there seems to be no option other than pasting that info onto X.
>>
>> All in all, I'm stuck with this API limitation and I would love to learn
>> some tricks from you if you could enlighten me.
>>
>> Thanks in advance!
>>
>> Manuel Castejón-Limas
>>
>>
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn@python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>>
>>
>
>
> --
> Guillaume Lemaitre
> INRIA Saclay - Parietal team
> Center for Data Science Paris-Saclay
> https://glemaitre.github.io/
>
