[ 
https://issues.apache.org/jira/browse/METRON-594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16256529#comment-16256529
 ] 

Otto Fowler commented on METRON-594:
------------------------------------

Would this replay the original data or the processed data?

> Replay Telemetry Data through Profiler
> --------------------------------------
>
>                 Key: METRON-594
>                 URL: https://issues.apache.org/jira/browse/METRON-594
>             Project: Metron
>          Issue Type: Improvement
>            Reporter: Nick Allen
>
> The Profiler currently consumes live telemetry, in real time, as it is
> streamed through Metron.  A useful extension of this functionality would
> allow the Profiler to also consume archived, historical telemetry.  Allowing
> a user to selectively replay archived, historical raw telemetry through the
> Profiler has a number of applications.  The following use cases help describe
> why this might be useful.
>
> Use Case 1 - Model Development
>
> When developing a new model, I often need a feature set of historical data on
> which to train my model.  I can either wait days, weeks, or months for the
> Profiler to generate this based on live data, or I could re-run the raw,
> historical telemetry through the Profiler to get started immediately.  It is
> much simpler to use the same mechanism to create this historical data set
> than to use a separate batch-driven tool to recreate something that only
> approximates the historical feature set.
>
> Use Case 2 - Model Deployment
>
> When deploying an analytical model to a new environment, like production, on
> day 1 there is often no historical data for the model to work with.  This
> often leaves a gap between when the model is deployed and when that model is
> actually useful.  If I could replay raw telemetry through the Profiler, a
> historical feature set could be created as part of the deployment process.
> This would allow my model to start functioning on day 1.
>
> Use Case 3 - Profile Validation
>
> When creating a Profile, it is difficult to understand how the configured
> profile might behave against the entire data set.  By creating the profile
> and watching it consume real-time streaming data, I only have an
> understanding of how it behaves on that small segment of data.  If I am able
> to replay historical telemetry, I can instantly understand how it behaves on
> a much larger data set, including all the anomalies and exceptions that
> exist in all large data sets.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
