[ 
https://issues.apache.org/jira/browse/METRON-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15655084#comment-15655084
 ] 

ASF GitHub Bot commented on METRON-562:
---------------------------------------

Github user cestella commented on the issue:

    https://github.com/apache/incubator-metron/pull/352
  
    I also want to summarize a couple of thoughts I had in the course of 
creating this PR.  It appears to me that a number of these outlier analysis 
models can be decomposed into a common two-step pattern:
    * Gather state via the profiler.
    * Operate on that state during scoring.
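
    As a rough illustration of this pattern (not code from the PR), the sketch 
below separates the two steps.  `RunningSummary` and `z_score` are hypothetical 
names standing in for profiler state and a scoring function, and the use of 
Welford's online algorithm is an assumption made for illustration; the actual 
Stellar statistical summaries may be computed differently.

```python
import math


class RunningSummary:
    """Hypothetical stand-in for the state a profile would accumulate."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def add(self, x):
        # Welford's online update: one pass over the data, constant memory.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def stddev(self):
        # Sample standard deviation; undefined for fewer than two points.
        if self.n < 2:
            return float("nan")
        return math.sqrt(self.m2 / (self.n - 1))


def z_score(summary, x):
    """Scoring step: subtraction and division against the gathered state."""
    return (x - summary.mean) / summary.stddev()
```

    The point of the split is that `add` runs continuously as data arrives, 
while `z_score` only reads the summary at scoring time.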
    
    For these z-score based statistical outlier analysis techniques/models, 
that state comes in the form of a statistical summary, or multiple statistical 
summaries in the case of median absolute deviation (MAD).  The scoring 
component is literally just computing a modified z-score (scaled by MAD).  The 
question arises as to whether this model is better served in MaaS or as a 
Stellar function.  Where I have come down in this PR is that these very simple 
z-score or deviation-from-a-central-moment style outlier models fit better 
within Stellar because they are so simple computationally.  Furthermore, it 
would be less than ideal performance-wise to make a network hop to a REST 
interface in MaaS just to do simple subtraction and division.
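
    To make the "simple subtraction and division" concrete, here is a minimal 
sketch of a MAD-scaled modified z-score.  It is not the Stellar implementation 
from the PR: the function names are hypothetical, and it computes the median 
and MAD from a raw list of values rather than from profiler summaries.  The 
0.6745 constant and the 3.5 cutoff follow the Iglewicz and Hoaglin 
recommendation described in the NIST handbook page linked from this issue.

```python
from statistics import median


def modified_z_score(history, x):
    """Score x against the median and MAD of previously observed values."""
    med = median(history)
    # MAD: the median of absolute deviations from the median.
    mad = median([abs(v - med) for v in history])
    if mad == 0:
        # At least half the values equal the median, so the score is undefined.
        raise ValueError("MAD is zero; modified z-score is undefined")
    return 0.6745 * (x - med) / mad


def is_outlier(history, x, threshold=3.5):
    """Flag x when the absolute modified z-score exceeds the threshold."""
    return abs(modified_z_score(history, x)) > threshold
```

    For the values 1 through 9 the median is 5 and the MAD is 2, so a new 
value of 25 scores 0.6745 * 20 / 2 = 6.745 and would be flagged.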
    
    An alternate architecture here would be to use MaaS and have the model pull 
the appropriate statistical context from the profiler's store in HBase.  I am 
sensitive to the possibility that where models get applied within Metron may 
become a point of confusion, and that a consistent message is important.  I 
decided to go with simplicity and performance for the moment, but I'd love to 
hear if anyone objects.


> Add rudimentary statistical outlier detection
> ---------------------------------------------
>
>                 Key: METRON-562
>                 URL: https://issues.apache.org/jira/browse/METRON-562
>             Project: Metron
>          Issue Type: New Feature
>            Reporter: Casey Stella
>            Assignee: Casey Stella
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> With the advent of the profiler, we can now capture state.  Furthermore, with 
> Stellar, we can capture statistical summaries.  We should provide rudimentary 
> outlier detection functionality in the form of Stellar functions that can 
> operate on captured state from the profiler.
> To begin, we should enable simple outlier tests using distance from a central 
> measure such as Median Absolute Deviation (see 
> http://www.itl.nist.gov/div898/handbook/eda/section3/eda35h.htm).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
