[ 
https://issues.apache.org/jira/browse/SPARK-1303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13967630#comment-13967630
 ] 

Martin Jaggi commented on SPARK-1303:
-------------------------------------

Discretization: see also https://issues.apache.org/jira/browse/SPARK-1216
Can you link the pull request here as well please?

Feature selection: see also https://issues.apache.org/jira/browse/SPARK-1473

> Added discretization capability to MLlib.
> -----------------------------------------
>
>                 Key: SPARK-1303
>                 URL: https://issues.apache.org/jira/browse/SPARK-1303
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>            Reporter: LIDIAgroup
>             Fix For: 1.0.0
>
>
> Some time ago, we discussed with Ameet Talwalkar the possibility of 
> including both Feature Selection and Discretization algorithms in MLlib.
> In this patch we've implemented Entropy Minimization Discretization following 
> the algorithm described in the paper "Multi-interval discretization of 
> continuous-valued attributes for classification learning" by Fayyad and Irani 
> (1993). This is one of the most widely used discretizers and is already 
> included in libraries such as Weka. It can be used as a base for FS 
> algorithms and for the NaiveBayes already included in MLlib.
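
For readers unfamiliar with the method named in the description, the following is a minimal single-machine sketch of entropy minimization discretization with the MDL stopping rule from Fayyad and Irani (1993). It is written in Python for illustration only and is not the code in the patch; the function names (entropy, best_cut, mdlp_accepts, discretize) are hypothetical, and no Spark or MLlib API is used or implied.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of the class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_cut(values, labels):
    """Find the boundary with the lowest weighted class entropy.

    Returns (weighted_entropy, cut, left_labels, right_labels),
    or None if all values are identical.
    """
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    n = len(pairs)
    best = None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary between equal values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if best is None or weighted < best[0]:
            cut = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (weighted, cut, left, right)
    return best


def mdlp_accepts(labels, left, right, weighted):
    """MDL criterion: accept the cut only if the gain pays for its cost."""
    n = len(labels)
    gain = entropy(labels) - weighted
    k, k1, k2 = len(set(labels)), len(set(left)), len(set(right))
    delta = math.log2(3 ** k - 2) - (
        k * entropy(labels) - k1 * entropy(left) - k2 * entropy(right)
    )
    return gain > (math.log2(n - 1) + delta) / n


def discretize(values, labels):
    """Recursively split one continuous attribute; returns sorted cut points."""
    if len(values) < 2:
        return []
    found = best_cut(values, labels)
    if found is None:
        return []
    weighted, cut, left, right = found
    if not mdlp_accepts(labels, left, right, weighted):
        return []
    low = [(v, l) for v, l in zip(values, labels) if v <= cut]
    high = [(v, l) for v, l in zip(values, labels) if v > cut]
    return (
        discretize([v for v, _ in low], [l for _, l in low])
        + [cut]
        + discretize([v for v, _ in high], [l for _, l in high])
    )
```

Calling discretize on a single attribute column and its class labels returns the accepted cut points in ascending order; an empty list means the MDL criterion rejected every candidate split. This sketch recomputes entropies from scratch at every boundary for clarity, so a distributed version would need a different strategy for sorting and counting.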



--
This message was sent by Atlassian JIRA
(v6.2#6252)
