Data cleaning & enrichment

Could you link to an example of such a mixin?

Currently this is a bit of a mess: custom pickle persistence inside a big
for loop, plus custom transformers.

Thanks.
Georg
Joel Nothman <joel.noth...@gmail.com> wrote on Wed, 16 Aug 2017 at 13:51:

> We certainly considered this over the many years that Pipeline caching has
> been in the pipeline. Storing the fitted model means we can do both a
> fit_transform and a transform on new data, and in many cases takes away the
> pain point of CV over pipelines where downstream steps are varied.
>
> What transformer are you using where the transform is costly? Or is it
> more a matter of you wanting to store the transformed data at each step?
>
> There are custom ways to do this sort of thing generically with a mixin if
> you really want.
>
> On 16 August 2017 at 21:28, Georg Heiler <georg.kf.hei...@gmail.com>
> wrote:
>
>> There is a new option in the pipeline:
>> http://scikit-learn.org/stable/modules/pipeline.html#pipeline-cache
>> How can I use this to also store the transformed data? During
>> hyperparameter tuning I only want to compute the last step, i.e. the
>> estimator, and not the transform methods of the cleaning steps.
>>
>> Is there a possibility to apply this to cross-validation? I would want
>> all the folds precomputed and stored to disk in a folder.
>>
>> Regards,
>> Georg
>>
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn@python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>>
>>
