[ 
https://issues.apache.org/jira/browse/METRON-590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15749357#comment-15749357
 ] 

ASF GitHub Bot commented on METRON-590:
---------------------------------------

GitHub user nickwallen opened a pull request:

    https://github.com/apache/incubator-metron/pull/395

    METRON-590 Enable Use of Event Time in Profiler

    ## [METRON-590](https://issues.apache.org/jira/browse/METRON-590)
    
    ### Changes
    
    * Added event time processing support to the Profiler.  Previously, the 
Profiler only supported processing time (also known as wall clock time).  Event 
time processing is advantageous because it is not susceptible to skew caused by 
heavy processing load, allows archived telemetry data to be reprocessed or 
replayed, and under certain circumstances can produce a more accurate profile 
of entity behavior.
    
    * By default, the Profiler will use event time processing.  The Flux 
topology definition file must be edited to switch the Profiler to processing 
time (wall clock time).
    
    * The Profiler now leverages the windowing functionality introduced in 
Storm 1.x.  This provides the core engine for event time processing.  It also 
provides a means of using different window types, like sliding windows, in the 
Profiler.  This is not currently exposed to users of the Profiler; the Flux 
topology definition file must be edited to use a different window type.
    
    * Enhanced the Profiler integration tests, an improvement made possible by 
event time processing.  The integration tests now generate 24 hours of 
telemetry data at roughly 3 messages per minute, and then flush profile values 
every 15 minutes.  The entire stream of values generated by the Profiler is 
then validated for correctness.
    
    * Created a `ConfigurationManager` that can be used to read the latest 
configuration changes from a remote data store like Zookeeper.  The default 
implementation, `ZkConfigurationManager`, replicates the functionality that is 
embedded in the `ConfiguredBolt` base class.  The Profiler bolts can no longer 
subclass `ConfiguredBolt` because it extends Storm's `BaseRichBolt`, which 
will not work for the Profiler bolts.
    
    * The usability of the Profiler was enhanced to better support active 
profiles that are subsequently edited by the user. Changes should be handled 
seamlessly by the Profiler.  This is especially helpful when a mistake is made 
when creating a profile, which then needs to be fixed and updated.  The 
Profiler was also made more resilient to failures specific to a single Profile 
or Tuple.  Individual failures should not impact other Profiles or Tuples.
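
For illustration, the integration test scenario described above (24 hours of telemetry, flushed every 15 minutes) can be sketched in plain Java. This is not the Profiler's implementation, which relies on Storm's windowing. The class `EventTimeWindows`, its methods, and the exact 20-second message spacing are assumptions made for this example, and watermarks and late-arriving data are ignored.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names come from Metron or Storm.
class EventTimeWindows {

    static final long WINDOW_MILLIS = 15 * 60 * 1000L;       // flush period: 15 minutes
    static final long MESSAGE_GAP_MILLIS = 20 * 1000L;       // assumed: exactly 3 messages per minute
    static final long DAY_MILLIS = 24 * 60 * 60 * 1000L;     // 24 hours of telemetry

    // Generate the event timestamps for one day of telemetry, starting at time zero.
    static List<Long> generateTimestamps() {
        List<Long> timestamps = new ArrayList<>();
        for (long t = 0; t < DAY_MILLIS; t += MESSAGE_GAP_MILLIS) {
            timestamps.add(t);
        }
        return timestamps;
    }

    // Assign each message to a tumbling window using only its event timestamp.
    // The key of the returned map is the start of the window in milliseconds.
    static Map<Long, Integer> countPerWindow(List<Long> eventTimestamps) {
        Map<Long, Integer> counts = new TreeMap<>();
        for (long timestamp : eventTimestamps) {
            long windowStart = (timestamp / WINDOW_MILLIS) * WINDOW_MILLIS;
            counts.merge(windowStart, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Long> timestamps = generateTimestamps();
        // Shuffle to mimic out-of-order arrival; the per-window counts do not change.
        Collections.shuffle(timestamps, new Random(42));
        Map<Long, Integer> counts = countPerWindow(timestamps);
        System.out.println(timestamps.size() + " messages in " + counts.size() + " windows");
    }
}
```

Because the window is derived from the event timestamp alone, the order and speed at which messages arrive have no effect on the result, which is what makes a deterministic check of the full stream of profile values feasible.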
    
    ### Testing
    
    Tested on a multi-node AWS cluster and the Quick Dev environment. Created, 
edited, and deleted multiple profile definitions as the Profiler was running 
and responding to the changes.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/nickwallen/incubator-metron METRON-590

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-metron/pull/395.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #395
    
----
commit cca756b7781ee7058edadfa84777bce7286d7817
Author: Nick Allen <[email protected]>
Date:   2016-12-07T20:14:07Z

    METRON-590 Enable Use of Event Time in Profiler

----


> Enable Use of Event Time in Profiler
> ------------------------------------
>
>                 Key: METRON-590
>                 URL: https://issues.apache.org/jira/browse/METRON-590
>             Project: Metron
>          Issue Type: Improvement
>            Reporter: Nick Allen
>            Assignee: Nick Allen
>
> There are at least two different times that are important to consider when 
> handling the telemetry messages received by Metron.  
> (1) Processing time is the time at which Metron processed the message.  
> (2) Event time is the time at which the event actually occurred.
> If Metron is consuming live data and all is well, the processing and event 
> times may remain close and consistent. When processing time differs from 
> event time, the data produced by the Profiler may be inaccurate.  There are a 
> few scenarios in which these times might differ greatly, which would 
> negatively impact the feature set produced by the Profiler.  
> (1) When the system has experienced an outage, for example, a scheduled 
> maintenance window. When restarted, a high volume of messages will need to be 
> processed by the Profiler.  The output of the Profiler will indicate an 
> increase in activity, although no change in activity actually occurred on the 
> target network.  This could happen whether the outage was in Metron itself or 
> in an upstream system that feeds data to Metron.
> (2) If the user attempts to replay historical telemetry through the Profiler, 
> the Profiler will attribute the activity to the time period in which it was 
> processed.  The activity should instead be attributed to the time period in 
> which the raw telemetry events originated.
> There are some scenarios when processing time might be preferred and other 
> use cases where event time is preferred.  The Profiler should be enhanced to 
> allow it to produce profiles based on either processing time or event time.
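
The outage scenario described above can be made concrete with a small, self-contained Java sketch. It is only an illustration under stated assumptions: the `ClockComparison` class, the `Message` type, and the numbers (one event per minute for 45 minutes, with the entire backlog processed at minute 45) are invented for the example and do not come from Metron.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names come from Metron.
class ClockComparison {

    static final long PERIOD_MINUTES = 15;

    // A telemetry message carrying both clocks, in minutes for readability.
    static final class Message {
        final long eventMinute;      // when the event actually occurred
        final long processedMinute;  // when the message was processed

        Message(long eventMinute, long processedMinute) {
            this.eventMinute = eventMinute;
            this.processedMinute = processedMinute;
        }
    }

    // Assumed scenario: steady activity during an outage, all processed on restart.
    static List<Message> backlogAfterOutage() {
        List<Message> messages = new ArrayList<>();
        for (long minute = 0; minute < 45; minute++) {
            messages.add(new Message(minute, 45));
        }
        return messages;
    }

    // Count messages per 15-minute period using the chosen clock.
    static Map<Long, Integer> countByPeriod(List<Message> messages, boolean useEventTime) {
        Map<Long, Integer> counts = new TreeMap<>();
        for (Message m : messages) {
            long time = useEventTime ? m.eventMinute : m.processedMinute;
            long periodStart = (time / PERIOD_MINUTES) * PERIOD_MINUTES;
            counts.merge(periodStart, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Message> messages = backlogAfterOutage();
        System.out.println("processing time: " + countByPeriod(messages, false));
        System.out.println("event time:      " + countByPeriod(messages, true));
    }
}
```

With processing time, all 45 messages land in the single period that begins at minute 45, which looks like a burst of activity. With event time, the same messages are spread evenly across the three periods in which the events occurred.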



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
