Ashutosh Trivedi created SPARK-4038:
---------------------------------------

             Summary: Outlier Detection Algorithm for MLlib
                 Key: SPARK-4038
                 URL: https://issues.apache.org/jira/browse/SPARK-4038
             Project: Spark
          Issue Type: New Feature
          Components: MLlib
    Affects Versions: 1.2.0
            Reporter: Ashutosh Trivedi


The aim of this JIRA is to discuss which parallel outlier detection algorithms can be included in MLlib.

The one I am familiar with is Attribute Value Frequency (AVF). It scales linearly with the number of data points and attributes, and relies on a single data scan. It is not distance-based and is well suited to categorical data. The original paper also gives a parallel version, which is not complicated to implement.

Link to the paper:
http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=4410382
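
To make the discussion concrete, here is a minimal sketch of AVF scoring in plain Python. It is not Spark or MLlib code and not the paper's parallel version. It assumes the usual reading of AVF: a record's score is the mean, over its attributes, of how often each of its values occurs in that attribute across the data set, and the lowest-scoring records are the outlier candidates. The names avf_scores and lowest_k and the toy data are hypothetical, chosen only for illustration. For clarity the sketch counts in one pass and scores in a second.

```python
from collections import Counter


def avf_scores(rows):
    """Return one AVF score per row; lower scores suggest outliers.

    Hypothetical helper for illustration, not an MLlib API.
    Assumes every row has the same number of categorical attributes.
    """
    rows = [tuple(r) for r in rows]
    if not rows:
        return []
    m = len(rows[0])
    # Count how often each (attribute index, value) pair occurs.
    counts = Counter((j, v) for row in rows for j, v in enumerate(row))
    # Score each row as the mean frequency of its attribute values.
    return [sum(counts[(j, v)] for j, v in enumerate(row)) / m for row in rows]


def lowest_k(rows, k):
    """Return the indices of the k rows with the lowest AVF scores."""
    scores = avf_scores(rows)
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return order[:k]


# Toy data, made up for this example.
data = [
    ("red", "small", "round"),
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "tiny", "square"),
]

print(avf_scores(data))
print(lowest_k(data, 1))
```

The counting step is a sum over (attribute, value) keys, so in a distributed setting it would presumably map onto a keyed aggregation followed by a per-record lookup. That mapping is an assumption here, not something this sketch demonstrates.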