[
https://issues.apache.org/jira/browse/METRON-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15658165#comment-15658165
]
ASF GitHub Bot commented on METRON-562:
---------------------------------------
Github user cestella commented on a diff in the pull request:
https://github.com/apache/incubator-metron/pull/352#discussion_r87657821
--- Diff:
metron-analytics/metron-profiler/src/main/java/org/apache/metron/profiler/bolt/ProfileBuilderBolt.java
---
@@ -246,6 +247,8 @@ private void flush(Tuple tickTuple) {
// clear the execution state to prepare for the next window
executor.clearState();
+ profileConfig.getTickUpdate().forEach((var, expr) -> executor.assign(var, expr, new HashMap<>()));
--- End diff --
I do not need to access the message, but rather just the internal state. I
have made a modification which does what I think it should do with some
comments. If you see issues, I'd love it if you could let me have it. :)
Taking a step back, the purpose of this is to handle situations where we need
to do things like capture a trailing window of state. In the case of Median
Absolute Deviation it is required because we need not just the distribution of
the values, but also the distribution of the DEVIATION of each value from the
median (which is the median of the values in the trailing window). That window
can be updated upon tick. So, for instance, if we want a 3-hour lookback, at
hour 4 we'd have hours 1, 2, and 3. At hour 5, we'd want hours 2, 3, and 4, and
so on. Constructing this trailing window led me to this. I could have
captured this state internally in memory, but then the state for the entire
trailing window would be lost upon worker crash/restart.
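To make the trailing-window idea concrete, here is a minimal, self-contained
sketch in plain Java. It is not the code in this pull request: the class
TrailingMadSketch and its methods are hypothetical names made up for
illustration, and it deliberately keeps the window in memory (the very approach
described above as fragile on worker restart) so that it stays short. It
assumes each tick hands over the values of the window that just closed, and it
uses the modified z-score from the NIST handbook page linked in the issue
(0.6745 * (x - median) / MAD).

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Hypothetical illustration only; not part of Metron or of this PR.
final class TrailingMadSketch {
    private final int maxWindows;                       // e.g. 3 for a 3-hour lookback
    private final Deque<double[]> windows = new ArrayDeque<>();

    TrailingMadSketch(int maxWindows) {
        if (maxWindows < 1) {
            throw new IllegalArgumentException("maxWindows must be >= 1");
        }
        this.maxWindows = maxWindows;
    }

    // On tick: append the window that just closed and evict the oldest one.
    void onTick(double[] closedWindow) {
        windows.addLast(closedWindow.clone());
        if (windows.size() > maxWindows) {
            windows.removeFirst();
        }
    }

    int windowCount() {
        return windows.size();
    }

    // Median of a non-empty array; the input is not modified.
    static double median(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no values");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    // All values currently held in the trailing window, oldest first.
    double[] trailingValues() {
        return windows.stream().flatMapToDouble(Arrays::stream).toArray();
    }

    // Median of the absolute deviations from the trailing-window median.
    double mad() {
        double[] values = trailingValues();
        double med = median(values);
        double[] deviations = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            deviations[i] = Math.abs(values[i] - med);
        }
        return median(deviations);
    }

    // Modified z-score of x against the trailing window.
    // Not guarded against MAD == 0; the result is then infinite or NaN.
    double modifiedZScore(double x) {
        return 0.6745 * (x - median(trailingValues())) / mad();
    }
}
```

With a lookback of 3, feeding four windows leaves only the last three in play,
which mirrors the "at hour 5, we'd want hours 2, 3, and 4" example. The NIST
page suggests treating an absolute modified z-score above 3.5 as a potential
outlier; that threshold is a convention, not something this PR fixes.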
> Add rudimentary statistical outlier detection
> ---------------------------------------------
>
> Key: METRON-562
> URL: https://issues.apache.org/jira/browse/METRON-562
> Project: Metron
> Issue Type: New Feature
> Reporter: Casey Stella
> Assignee: Casey Stella
> Original Estimate: 48h
> Remaining Estimate: 48h
>
> With the advent of the profiler, we can now capture state. Furthermore, with
> Stellar, we can capture statistical summaries. We should provide rudimentary
> outlier detection functionality in the form of Stellar functions that can
> operate on captured state from the profiler.
> To begin, we should enable simple outlier tests using distance from a central
> measure such as Median Absolute Deviation (see
> http://www.itl.nist.gov/div898/handbook/eda/section3/eda35h.htm).
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)