[ 
https://issues.apache.org/jira/browse/MADLIB-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15336969#comment-15336969
 ] 

ASF GitHub Bot commented on MADLIB-990:
---------------------------------------

GitHub user iyerr3 opened a pull request:

    https://github.com/apache/incubator-madlib/pull/48

    SVM: Novelty detection using 1-class SVM

    Jira: MADLIB-990
    
    Additional author: Nandish Jayaram <[email protected]>
    
    In this implementation of a one-class SVM, we are piggy-backing on the
    existing SVM classification. The input table to a one-class SVM does not
    require a dependent variable. A maximum-margin classifier is learned that
    separates all the data from the origin. The default kernel for one-class
    is Gaussian (rbf).
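
As a rough illustration of the decision rule described above (not the code in
this pull request), the sketch below scores a row x as sum_i alpha_i * k(x_i, x)
and compares it against an offset rho. With a Gaussian kernel every row maps to
a unit-length feature vector (k(x, x) = 1), so a hyperplane separating the data
from the origin always exists. The big assumption here: the weights alpha_i are
uniform rather than learned by an SVM solver, and rho is simply the lowest
training score. All function names, the gamma value, and the sample rows are
hypothetical.

```python
import math

def rbf_kernel(a, b, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * ||a - b||^2)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)

def score(x, train, alphas, gamma=0.5):
    """Decision score sum_i alpha_i * k(x_i, x), before subtracting rho."""
    return sum(a * rbf_kernel(xi, x, gamma) for a, xi in zip(alphas, train))

def fit_uniform_one_class(train, gamma=0.5):
    """Return (alphas, rho) for the rule score(x) >= rho.

    Assumption: alphas are uniform (1/n), NOT learned by an SVM solver.
    rho is the lowest training score, so every training row is an inlier.
    """
    n = len(train)
    alphas = [1.0 / n] * n
    rho = min(score(x, train, alphas, gamma) for x in train)
    return alphas, rho

def is_inlier(x, train, alphas, rho, gamma=0.5):
    """True if x falls on the data side of the separating hyperplane."""
    return score(x, train, alphas, gamma) >= rho

# Hypothetical training rows: features only, no dependent variable.
train = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.2)]
alphas, rho = fit_uniform_one_class(train)

print(is_inlier((1.0, 1.1), train, alphas, rho))  # a row close to the training data
print(is_inlier((5.0, 5.0), train, alphas, rho))  # a row far from the training data
```

A real one-class SVM would instead solve for the alpha_i under a constraint
controlled by a nu-style parameter, which lets a fraction of training rows fall
on the wrong side of the margin.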

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/iyerr3/incubator-madlib feature/svm_one_class

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-madlib/pull/48.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #48
    
----
commit c4211cef7043081852e1346218e81a78800a7428
Author: Rahul Iyer <[email protected]>
Date:   2016-04-21T18:38:28Z

    SVM: Novelty detection using 1-class SVM
    
    Jira: MADLIB-990
    
    Additional author: Nandish Jayaram <[email protected]>
    
    In this implementation of a one-class SVM, we are piggy-backing on the
    existing SVM classification. The input table to a one-class SVM does not
    require a dependent variable. A maximum-margin classifier is learned that
    separates all the data from the origin. The default kernel for one-class
    is Gaussian (rbf).

----


> SVM - novelty detection using 1-class SVM
> -----------------------------------------
>
>                 Key: MADLIB-990
>                 URL: https://issues.apache.org/jira/browse/MADLIB-990
>             Project: Apache MADlib
>          Issue Type: New Feature
>          Components: Module: Support Vector Machines
>            Reporter: Frank McQuillan
>            Assignee: Nandish Jayaram
>             Fix For: v1.9.1
>
>
> Story
> As a data scientist, I want to use a one-class SVM so that I can decide 
> whether a new observation belongs to the same distribution as existing 
> observations (an inlier), or should be considered as different (an outlier). 
> Acceptance
> 1) One-class SVM implemented with all supported kernel types (linear, 
> gaussian, polynomial).
> 2) Output a T/F for not-novel/novel.
> Note
> a) Similar to the e1071 R package [1] with
> type=one-classification (for novelty detection)
> b) There is an important distinction between novelty detection (this story) 
> and outlier detection for cleaning training data.  From reference [2]:
> * novelty detection:  the training data is not polluted by outliers, and we 
> are interested in detecting anomalies in new observations. <- this story
> * outlier detection:  the training data contains outliers, and we need to fit 
> the central mode of the training data, ignoring the deviant observations. <- 
> we are *not* trying to solve this unsupervised learning problem in this story.
> References
> [1] e1071 R package
> https://cran.r-project.org/web/packages/e1071/index.html
> [2] Difference between novelty and outlier detection
> http://scikit-learn.org/stable/modules/outlier_detection.html
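
The acceptance criteria quoted above name three kernel types and a T/F output.
The sketch below writes out the standard textbook definitions of those kernels
and one possible mapping from a signed decision value to the T/F flag. The
parameter names (gamma, coef0, degree), their defaults, and the reading that
T means not-novel are assumptions made for illustration; they are not taken
from the MADlib source.

```python
import math

def dot(a, b):
    """Inner product of two equal-length numeric sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))

def linear_kernel(a, b):
    """k(a, b) = <a, b>"""
    return dot(a, b)

def polynomial_kernel(a, b, coef0=1.0, degree=3):
    """k(a, b) = (<a, b> + coef0) ** degree"""
    return (dot(a, b) + coef0) ** degree

def gaussian_kernel(a, b, gamma=1.0):
    """k(a, b) = exp(-gamma * ||a - b||^2)"""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)

# Lookup by name, mirroring the kernel types listed in the story.
KERNELS = {
    "linear": linear_kernel,
    "polynomial": polynomial_kernel,
    "gaussian": gaussian_kernel,
}

def not_novel(decision_value):
    """Assumed mapping: True (T) = not novel / inlier, False (F) = novel."""
    return decision_value >= 0.0

a, b = (1.0, 2.0), (3.0, 4.0)
for name, kernel in KERNELS.items():
    print(name, kernel(a, b))
```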



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
