[
https://issues.apache.org/jira/browse/SPARK-1303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Patrick Wendell updated SPARK-1303:
-----------------------------------
Fix Version/s: (was: 1.0.0)
> Added discretization capability to MLlib.
> -----------------------------------------
>
> Key: SPARK-1303
> URL: https://issues.apache.org/jira/browse/SPARK-1303
> Project: Spark
> Issue Type: New Feature
> Components: MLlib
> Reporter: LIDIAgroup
>
> Some time ago, we discussed with Ameet Talwalkar the possibility of
> including both Feature Selection and Discretization algorithms in MLlib.
> In this patch we've implemented Entropy Minimization Discretization following
> the algorithm described in the paper "Multi-interval discretization of
> continuous-valued attributes for classification learning" by Fayyad and Irani
> (1993). This is one of the most widely used discretizers and is already
> included in most libraries, such as Weka. It can be used as a basis for FS
> algorithms and for the NaiveBayes already included in MLlib.
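
For readers unfamiliar with the method, here is a minimal single-machine
sketch of entropy minimization discretization with the MDL stopping rule from
Fayyad and Irani (1993). It is not the code in the patch: it is written in
Python for brevity rather than for Spark, and the function names (entropy,
best_cut, mdlp_accepts, discretize) are hypothetical ones chosen for
illustration. It assumes one continuous attribute and one class label per
example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_cut(pairs):
    """Find the boundary that minimizes the class-weighted entropy.

    `pairs` is a list of (value, label) sorted by value. Returns
    (index, weighted_entropy), where the cut separates pairs[:index] from
    pairs[index:], or None if all values are identical.
    """
    n = len(pairs)
    best = None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot cut between two equal attribute values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        e = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if best is None or e < best[1]:
            best = (i, e)
    return best


def mdlp_accepts(labels, left, right, weighted_entropy):
    """MDL criterion: accept the cut only if the information gain exceeds
    (log2(N - 1) + delta) / N."""
    n = len(labels)
    gain = entropy(labels) - weighted_entropy
    k, k1, k2 = len(set(labels)), len(set(left)), len(set(right))
    delta = math.log2(3 ** k - 2) - (
        k * entropy(labels) - k1 * entropy(left) - k2 * entropy(right)
    )
    return gain > (math.log2(n - 1) + delta) / n


def discretize(values, labels):
    """Return the sorted cut points for one continuous attribute."""
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    cuts = []

    def split(part):
        if len(part) < 2:
            return
        found = best_cut(part)
        if found is None:
            return
        i, e = found
        part_labels = [label for _, label in part]
        if not mdlp_accepts(part_labels, part_labels[:i], part_labels[i:], e):
            return
        # Place the cut point midway between the two neighbouring values.
        cuts.append((part[i - 1][0] + part[i][0]) / 2)
        split(part[:i])
        split(part[i:])

    split(pairs)
    return sorted(cuts)
```

With values [1, 2, 3, 4, 10, 11, 12, 13] and labels [0, 0, 0, 0, 1, 1, 1, 1],
the sketch yields a single cut point at 7.0. An attribute whose labels
alternate with no relation to the value gets no cut points, because the MDL
criterion rejects the first split.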