GitHub user nickwallen opened a pull request:

    https://github.com/apache/incubator-metron/pull/226

    METRON-377 Enable Profiles that Use Non-Single Pass Summary Functions

    ### [METRON-377](https://issues.apache.org/jira/browse/METRON-377)
    
    This pull request replaces #215.
    
    As of METRON-372, Profiles can be built using many statistical summaries 
that require only a single pass over the data.  This is less memory intensive 
and more scalable for high volume loads.
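
For readers unfamiliar with the idea, here is a minimal sketch of a single-pass summary. It uses Welford's algorithm to keep a running mean and variance in constant memory. The class name `RunningStats` is hypothetical and this is not the Metron implementation, only an illustration of why such summaries need no stored data.

```python
class RunningStats:
    """Single-pass count, mean and variance (Welford's algorithm).

    Hypothetical illustration only; not the Metron Profiler code.
    """

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def add(self, x):
        # Each value is folded into the summary and then discarded.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    def variance(self):
        # Sample variance; undefined for fewer than two values.
        return self._m2 / (self.n - 1) if self.n > 1 else float("nan")


stats = RunningStats()
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.add(value)
```

Memory use stays fixed at three numbers no matter how many values are added.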
    
    Unfortunately, not all functions can be calculated in a single pass.  In 
particular, the skewness, kurtosis and percentiles require all data to be stored 
in memory for the calculation to occur.  The platform was enhanced so that a 
user can leverage skewness, kurtosis and percentiles. 
    
    ### Changes
    
    The `STATS_INIT` function was enhanced to accept a `window_size`.  This 
defines the number of input data elements that are maintained in memory.  
    
    If the `window_size` is greater than 0, a rolling window of the most recent 
`window_size` elements is maintained in memory.  The skewness, kurtosis and 
percentiles are calculated over this rolling window.  The `window_size` must be 
greater than 0; otherwise these values cannot be calculated.
    
    If a user supplies a `window_size` equal to 0, then the more efficient 
implementation that does not maintain a rolling window in memory is used.  Of 
course, in this case, the skewness, kurtosis and percentiles cannot be 
calculated.
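
The `window_size` behavior described above can be sketched roughly as follows. This is an illustrative assumption, not the code in this pull request: the class `WindowedValues`, its methods, and the nearest-rank percentile rule are all hypothetical choices made for the example.

```python
import math
from collections import deque


class WindowedValues:
    """Hypothetical sketch of a rolling window controlled by window_size."""

    def __init__(self, window_size=0):
        if window_size < 0:
            raise ValueError("window_size must be >= 0")
        self.count = 0
        # window_size == 0 means nothing is retained in memory.
        self.window = deque(maxlen=window_size) if window_size > 0 else None

    def add(self, x):
        self.count += 1
        if self.window is not None:
            # deque(maxlen=...) silently evicts the oldest element when full.
            self.window.append(x)

    def percentile(self, p):
        # Nearest-rank percentile over the retained window (an assumed rule).
        if not self.window:
            raise ValueError("percentiles need window_size > 0 and at least one value")
        ordered = sorted(self.window)
        rank = max(1, math.ceil(p / 100.0 * len(ordered)))
        return ordered[rank - 1]


windowed = WindowedValues(window_size=3)
for value in [1, 2, 3, 4, 5]:
    windowed.add(value)
median = windowed.percentile(50)  # computed over the last 3 values only
```

With `window_size=0` the same object still counts values, but `percentile` raises an error because no data was kept.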
    
    The following functions were also added:
    * STATS_KURTOSIS
    * STATS_SKEWNESS
    * STATS_PERCENTILES
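
As a rough illustration of what skewness and kurtosis over a retained window involve, the sketch below uses the simple population-moment definitions, computing the mean first and then the central moments. The function name is hypothetical, and the functions added in this pull request may use different (for example bias-corrected) formulas.

```python
def skewness_and_kurtosis(values):
    """Population skewness and excess kurtosis of a stored list of values.

    Hypothetical helper for illustration; formulas are an assumption and
    may differ from those used by STATS_SKEWNESS and STATS_KURTOSIS.
    """
    n = len(values)
    if n == 0:
        raise ValueError("at least one value is required")
    mean = sum(values) / n
    # Central moments need the mean first, hence the stored values.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        # All values identical: both measures are undefined.
        return float("nan"), float("nan")
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


skew, kurt = skewness_and_kurtosis([1.0, 2.0, 3.0, 4.0, 5.0])
```

For the symmetric sample above the skewness is zero, which is a quick sanity check on the calculation.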
    
    Another integration test was added to validate this end to end.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/nickwallen/incubator-metron METRON-377-2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-metron/pull/226.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #226
    
----
commit f256fe7461a0a8273b0a9f9e2d10a01c7c53c473
Author: Nick Allen <[email protected]>
Date:   2016-08-23T17:03:15Z

    METRON-372 Enhance Statistical Operations Available for Use with the 
Profiler

commit 37d7c3813b4c142fff351cc5b020393a97a5d4d2
Author: Nick Allen <[email protected]>
Date:   2016-08-23T17:18:26Z

    METRON-377 Enable Profiles that Use Non-Single Pass Summary Functions

commit b6284ced44fd3f41945ad9e68267cd84b5aad782
Author: Nick Allen <[email protected]>
Date:   2016-08-23T17:35:24Z

    METRON-377 Added an integration tests to use new stats functionality

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
