GitHub user yanboliang commented on the issue:
https://github.com/apache/spark/pull/16011
@MLnick Your description is correct. However, the ```model``` in your
example is of type ```Bucketizer```, and I will keep
```handleInvalid``` in ```Bucketizer```. In the current ML code,
```QuantileDiscretizer``` (an estimator) produces a
```Bucketizer``` (a model), and invalid-data handling only happens
when you use the ```Bucketizer``` to transform data (possibly not the same
data used for training). When you run ```QuantileDiscretizer```, all NaN
values are ignored and no error is raised. That is why I propose removing
```handleInvalid``` from ```QuantileDiscretizer```.
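
To make the estimator/model split concrete, here is a minimal pure-Python sketch of the behavior described above. It is not Spark code. The class names `ToyQuantileDiscretizer` and `ToyBucketizer`, the `handle_invalid` attribute, and the simple quantile rule are all hypothetical stand-ins chosen for illustration. The only points it tries to show are that fitting ignores NaN without raising, and that the invalid-data policy takes effect only when the resulting model transforms data.

```python
import math
from bisect import bisect_right


class ToyBucketizer:
    """Toy model: maps values to bucket indices using fixed split points."""

    def __init__(self, splits, handle_invalid="error"):
        self.splits = splits                  # e.g. [-inf, 3.0, inf]
        self.handle_invalid = handle_invalid  # "error", "skip", or "keep"

    def transform(self, values):
        num_buckets = len(self.splits) - 1
        out = []
        for v in values:
            if math.isnan(v):
                # The invalid-data policy is applied here, at transform time.
                if self.handle_invalid == "error":
                    raise ValueError("NaN encountered during transform")
                if self.handle_invalid == "skip":
                    continue
                # "keep": place NaN in one extra bucket after the regular ones.
                out.append(num_buckets)
                continue
            # Bucket i covers [splits[i], splits[i + 1]); clamp +inf into the last one.
            out.append(min(bisect_right(self.splits, v) - 1, num_buckets - 1))
        return out


class ToyQuantileDiscretizer:
    """Toy estimator: learns split points and returns a ToyBucketizer."""

    def __init__(self, num_buckets):
        self.num_buckets = num_buckets

    def fit(self, values):
        # NaN values are dropped silently; fitting never raises because of them.
        # Assumes at least one non-NaN value is present.
        clean = sorted(v for v in values if not math.isnan(v))
        inner = [clean[len(clean) * i // self.num_buckets]
                 for i in range(1, self.num_buckets)]
        splits = [-math.inf] + sorted(set(inner)) + [math.inf]
        return ToyBucketizer(splits)


nan = float("nan")
model = ToyQuantileDiscretizer(num_buckets=2).fit([1.0, 2.0, 3.0, 4.0, nan])
model.handle_invalid = "keep"
print(model.transform([0.5, 3.5, nan]))  # [0, 1, 2]
```

In this sketch a `handle_invalid` option on the estimator would have nothing to act on, because `fit` never sees NaN as an error case; the option only matters on the model that `fit` returns.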