GitHub user nickwallen opened a pull request:
https://github.com/apache/incubator-metron/pull/225
METRON-372 Enhance Statistical Operations Available for Use with the
Profiler
### [METRON-372](https://issues.apache.org/jira/browse/METRON-372)
This PR serves as a replacement for #214.
### Changes
Only basic math functions are currently available in Stellar for use with
the Profiler. This makes it difficult for users to create even basic
profiles, such as a running average.
This can be seen in the following example, where the average must be
calculated manually in Stellar. Two variables, `sum` and `cnt`, must be
maintained and then used to calculate the average.
```
{
  "profile": "example3",
  "foreach": "ip_src_addr",
  "onlyif": "protocol == 'HTTP'",
  "init": {
    "sum": 0.0,
    "cnt": 0.0
  },
  "update": {
    "sum": "sum + resp_body_len",
    "cnt": "cnt + 1"
  },
  "result": "sum / cnt"
}
```
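To make the bookkeeping concrete, the following Python sketch mimics what this profile does for a handful of messages. It only illustrates the `init`/`update`/`result` flow as it reads in the example above; it is not the Profiler's actual implementation. The `run_profile` helper and the sample messages are hypothetical.

```python
from collections import defaultdict


def run_profile(messages):
    """Hypothetical stand-in for the profile above: one running average per ip_src_addr."""
    # "init": every entity starts with sum = 0.0 and cnt = 0.0
    state = defaultdict(lambda: {"sum": 0.0, "cnt": 0.0})
    for msg in messages:
        # "onlyif": ignore anything that is not HTTP
        if msg.get("protocol") != "HTTP":
            continue
        # "foreach": state is kept separately for each source IP
        s = state[msg["ip_src_addr"]]
        # "update": both variables have to be maintained by hand
        s["sum"] += msg["resp_body_len"]
        s["cnt"] += 1
    # "result": sum / cnt for each entity that saw at least one message
    return {ip: s["sum"] / s["cnt"] for ip, s in state.items()}


# Made-up sample messages, for illustration only
messages = [
    {"ip_src_addr": "10.0.0.1", "protocol": "HTTP", "resp_body_len": 100},
    {"ip_src_addr": "10.0.0.1", "protocol": "HTTP", "resp_body_len": 300},
    {"ip_src_addr": "10.0.0.2", "protocol": "DNS", "resp_body_len": 50},
    {"ip_src_addr": "10.0.0.2", "protocol": "HTTP", "resp_body_len": 40},
]
averages = run_profile(messages)
print(averages)
```

Every additional statistic written this way needs its own hand-maintained variables and its own formula in `result`.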
This change introduces a series of summary functions that make creating
profiles much simpler for the user. Instead of re-implementing the calculation
of an average in Stellar, this leverages Commons Math to perform all the heavy
lifting.
The example above for calculating an average can be re-defined as follows.
```
{
  "profile": "example3",
  "foreach": "ip_src_addr",
  "onlyif": "protocol == 'HTTP'",
  "init": { "s": "STATS_INIT()" },
  "update": { "_": "STATS_ADD(resp_body_len, s)" },
  "result": "STATS_MEAN(s)"
}
```
The following summary functions are supported. Each of these statistics
can be calculated in a single pass, so none of the values being summarized
need to be stored in memory.
* count
* mean
* geometric mean
* max
* min
* sum
* population variance
* variance
* second moment
* quadratic mean
* standard deviation
* sum of logs
* sum of squares
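The single-pass property can be sketched in a few lines of Python. This is a minimal illustration of the idea, using Welford's online update for the mean and variance; it is not the Commons Math code that this PR relies on. The `RunningStats` class and its method names are hypothetical, "second moment" is taken here to mean the sum of squared deviations from the mean, and the log-based statistics assume strictly positive inputs.

```python
import math


class RunningStats:
    """Hypothetical single-pass accumulator: keeps a few running totals, never the values."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.second_moment = 0.0  # assumed: sum of squared deviations from the mean
        self.total = 0.0          # sum
        self.sum_of_squares = 0.0
        self.sum_of_logs = 0.0
        self.min = math.inf
        self.max = -math.inf

    def add(self, x):
        # Welford's update: adjust the mean and second moment incrementally
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.second_moment += delta * (x - self.mean)
        self.total += x
        self.sum_of_squares += x * x
        self.sum_of_logs += math.log(x)  # assumes x > 0
        self.min = min(self.min, x)
        self.max = max(self.max, x)

    def population_variance(self):
        return self.second_moment / self.count

    def variance(self):
        # Sample variance; this sketch requires at least two values
        return self.second_moment / (self.count - 1)

    def standard_deviation(self):
        return math.sqrt(self.variance())

    def geometric_mean(self):
        return math.exp(self.sum_of_logs / self.count)

    def quadratic_mean(self):
        return math.sqrt(self.sum_of_squares / self.count)


# Made-up sample values, for illustration only
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
stats = RunningStats()
for v in values:
    stats.add(v)  # each value is folded into the totals and then discarded
print(stats.count, stats.mean, stats.population_variance(), stats.standard_deviation())
```

Memory use stays constant regardless of how many values are added, which is why statistics such as the median, which need the full set of values, do not appear in the list.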
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/nickwallen/incubator-metron METRON-372-NEW
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-metron/pull/225.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #225