Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/16011
Typically, though, the estimator's Params are copied to the model. How do you
propose to set the handleInvalid param in, say, a pipeline?
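
To make the concern concrete, here is a minimal pure-Python sketch of the pattern under discussion. It is not the Spark API: `ToyQuantileDiscretizer`, `ToyBucketizer`, and the `handle_invalid` argument are hypothetical stand-ins, and the quantile rule is a simplification. It assumes the estimator ignores NaN while fitting and copies its invalid-handling setting to the model it produces, so the setting only takes effect at transform time.

```python
import math
from bisect import bisect_right


class ToyBucketizer:
    """Hypothetical fitted model; invalid handling happens here, at transform time."""

    def __init__(self, splits, handle_invalid="error"):
        self.splits = splits
        self.handle_invalid = handle_invalid

    def transform(self, values):
        out = []
        for v in values:
            if math.isnan(v):
                if self.handle_invalid == "skip":
                    continue  # drop the invalid value
                raise ValueError("NaN encountered and handle_invalid='error'")
            # Bucket i covers [splits[i], splits[i + 1]); clamp to the last bucket.
            idx = bisect_right(self.splits, v) - 1
            out.append(min(max(idx, 0), len(self.splits) - 2))
        return out


class ToyQuantileDiscretizer:
    """Hypothetical estimator; NaN values are ignored while computing splits."""

    def __init__(self, num_buckets=2, handle_invalid="error"):
        self.num_buckets = num_buckets
        self.handle_invalid = handle_invalid

    def fit(self, values):
        clean = sorted(v for v in values if not math.isnan(v))
        # Simplified cut points taken at evenly spaced ranks of the clean data.
        cuts = sorted({clean[(len(clean) * i) // self.num_buckets]
                       for i in range(1, self.num_buckets)})
        splits = [-math.inf] + cuts + [math.inf]
        # The estimator's param is copied to the model it returns.
        return ToyBucketizer(splits, self.handle_invalid)


train = [1.0, 2.0, float("nan"), 3.0, 4.0]
model = ToyQuantileDiscretizer(num_buckets=2, handle_invalid="skip").fit(train)
result = model.transform([0.5, float("nan"), 3.5])  # [0, 1]
```

In this sketch, fitting never raises on NaN, yet the setting chosen on the estimator still decides what the fitted model does with NaN. Inside a pipeline the user only constructs the estimator, so without the param on the estimator there would be no obvious place to choose that behaviour.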
On Fri, 25 Nov 2016 at 18:38, Yanbo Liang <[email protected]> wrote:
> @MLnick <https://github.com/MLnick> Your description is totally correct.
> However, the model used in your example is of type Bucketizer. I will
> keep handleInvalid in Bucketizer. In the current ML code,
> QuantileDiscretizer (which is an estimator) produces a Bucketizer (which
> is a model), and invalid data handling only happens when you use the
> Bucketizer to transform some data (maybe not the same data used for
> training). When you run QuantileDiscretizer, all NaN values are
> ignored and no error is raised. That is why I propose to remove
> handleInvalid from QuantileDiscretizer.