[GitHub] spark issue #16011: [SPARK-18587][ML] Remove handleInvalid from QuantileDisc...

yanboliang Fri, 25 Nov 2016 08:38:40 -0800

Github user yanboliang commented on the issue:

    https://github.com/apache/spark/pull/16011
  
    @MLnick Your description is entirely correct. However, the ```model``` in 
your example is of type ```Bucketizer```, and I will keep 
```handleInvalid``` in ```Bucketizer```. In the current ML code, 
```QuantileDiscretizer``` (an estimator) produces a 
```Bucketizer``` (a model), and invalid-data handling happens only 
when you use the ```Bucketizer``` to transform data (which may not be the 
data used for training). When you run ```QuantileDiscretizer```, all NaN 
values are ignored and no error is raised. That is why I propose removing 
```handleInvalid``` from ```QuantileDiscretizer```.
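
The estimator/model split described above can be illustrated with a toy sketch in plain Python. This is not Spark's implementation: ```ToyQuantileDiscretizer```, ```ToyBucketizer```, the quantile rule, and the three ```handle_invalid``` modes ("error", "skip", "keep") are assumptions made for illustration only. Fitting drops NaN values without raising, while the invalid-data policy applies only at transform time.

```python
import math
from bisect import bisect_right


class ToyBucketizer:
    """Hypothetical model: maps values to bucket indices using fixed splits."""

    def __init__(self, splits, handle_invalid="error"):
        self.splits = splits                  # e.g. [-inf, 3.0, +inf]
        self.handle_invalid = handle_invalid  # assumed modes: error/skip/keep

    def transform(self, values):
        num_buckets = len(self.splits) - 1
        out = []
        for v in values:
            if math.isnan(v):
                # The invalid-data policy is applied here, at transform time.
                if self.handle_invalid == "error":
                    raise ValueError("NaN found during transform")
                if self.handle_invalid == "skip":
                    continue
                out.append(num_buckets)  # "keep": one extra bucket for NaN
            else:
                # Index of the half-open bucket [splits[i], splits[i + 1]).
                idx = bisect_right(self.splits, v) - 1
                out.append(min(idx, num_buckets - 1))
        return out


class ToyQuantileDiscretizer:
    """Hypothetical estimator: learns splits, silently ignoring NaN."""

    def __init__(self, num_buckets):
        self.num_buckets = num_buckets

    def fit(self, values):
        # NaN values are dropped during fitting; no error is raised.
        clean = sorted(v for v in values if not math.isnan(v))
        n = len(clean)
        inner = []
        if n > 0:
            inner = [clean[min(n - 1, (i * n) // self.num_buckets)]
                     for i in range(1, self.num_buckets)]
        splits = [-math.inf] + sorted(set(inner)) + [math.inf]
        return ToyBucketizer(splits)


# Fitting tolerates NaN; the fitted model decides what to do with it later.
model = ToyQuantileDiscretizer(2).fit([1.0, 2.0, float("nan"), 3.0, 4.0])
model.handle_invalid = "keep"
buckets = model.transform([0.5, float("nan"), 3.5])
```

In this sketch the estimator has no invalid-data setting at all, because nothing it does depends on one; only the model it returns carries the policy.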



