GitHub user nickwallen commented on the issue:

    https://github.com/apache/incubator-metron/pull/395
  
    I want to give some feedback on @cestella's comments about the changes to 
the Profiler Client API.  Before I do that, I want to make sure that we're all 
on the same page about the usage scenarios for this functionality.
    
    ##### "Live" Data
    
    The most common use case is creating profiles on live, streaming data.  In 
this case the processing time and event time will normally remain close, but 
could differ under abnormal conditions.  
    
    Event time processing is still very valuable in this scenario.  Using 
event time here has the following advantages:
     * Keeps profiles from being skewed by high demand that might delay 
processing
     * Allows the Profiler to ride out planned or unplanned outages and pick 
up where it left off
     * Produces more accurate behavioral profiles when there is a delay 
between when a behavior occurs and when the telemetry describing that behavior 
is received.  For example, think of a sensor that collects data in batches or 
mini-batches, so that we get data at regular intervals: every 10 minutes, 
hourly, etc.
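
    To make the batching example concrete, here is a minimal sketch of the 
difference.  It is not the Profiler's actual implementation; the 15-minute 
period, the `timestamp` and `processed_at` field names, and the helper 
functions are all assumptions made up for illustration.

```python
from collections import defaultdict

MINUTE_MS = 60 * 1000
PERIOD_MS = 15 * MINUTE_MS  # hypothetical 15-minute profile period


def period_of(timestamp_ms, period_ms=PERIOD_MS):
    """Return the index of the period that contains the timestamp."""
    return timestamp_ms // period_ms


def count_by_period(messages, clock):
    """Count messages per period; `clock` picks which timestamp is used."""
    counts = defaultdict(int)
    for message in messages:
        counts[period_of(clock(message))] += 1
    return dict(counts)


def event_time(message):
    # Hypothetical field: when the behavior actually occurred.
    return message["timestamp"]


def processing_time(message):
    # Hypothetical field: when the message reached the Profiler.
    return message["processed_at"]


# A sensor that ships its data in one batch at minute 20.  Three events
# occurred in the first 15-minute period and one in the second.
batch_arrival = 20 * MINUTE_MS
messages = [
    {"timestamp": minute * MINUTE_MS, "processed_at": batch_arrival}
    for minute in (1, 5, 14, 16)
]

by_event_time = count_by_period(messages, event_time)
by_processing_time = count_by_period(messages, processing_time)
```

    With event time, the counts land in the periods where the behavior 
happened (three in the first period, one in the second).  With processing 
time, all four are attributed to the period in which the batch arrived.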
    
    ##### Replayed Data
    
    The other use case that this positions us for is creating profiles from 
replayed or reprocessed archival data.  Say I am creating a model based on a 
new feature that the Profiler is generating for me.  When I move that model 
into Production, I need a historical view of that feature to train my model.  
I can replay archived telemetry through the Profiler to generate that history 
of my new feature.  I think I put more examples of this in the original JIRA 
too.
    
    This PR doesn't actually deliver everything we need to handle replaying 
data; it provides just one critical component.  I don't want to give anyone 
the impression that this PR allows us to replay data at this point in time.


