[
https://issues.apache.org/jira/browse/METRON-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16133543#comment-16133543
]
ASF GitHub Bot commented on METRON-1120:
----------------------------------------
GitHub user nickwallen opened a pull request:
https://github.com/apache/metron/pull/708
Metron 1120
[METRON-1120](https://issues.apache.org/jira/browse/METRON-1120)
- [ ] This is built on top of METRON-1121, so it should not be committed
before METRON-1121.
The `groupBy` expression can now reference any of these variables.
* `profile` The name of the profile.
* `entity` The name of the entity being profiled.
* `start` The start time of the profile period in epoch milliseconds.
* `end` The end time of the profile period in epoch milliseconds.
* `duration` The duration of the profile period in milliseconds.
* `result` The result of executing the `result` expression.
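
As a rough illustration of how `start`, `end`, and `duration` relate, the sketch below assumes that profile periods are aligned to whole multiples of the period duration since the epoch. That alignment is an assumption, though it is consistent with the flush output shown later in this description (`period=1670094`, `start=1503084600000`, `end=1503085500000`, `duration=900000`). The `profile_period` helper is hypothetical and is not part of Metron.

```python
def profile_period(timestamp_ms: int, duration_ms: int = 15 * 60 * 1000) -> dict:
    """Hypothetical helper: derive period fields from an epoch-millisecond timestamp.

    Assumes periods are aligned to whole multiples of duration_ms since the epoch.
    """
    period = timestamp_ms // duration_ms  # index of the period containing the timestamp
    start = period * duration_ms          # first millisecond of that period
    end = start + duration_ms             # first millisecond of the following period
    return {"period": period, "start": start, "end": end, "duration": duration_ms}


if __name__ == "__main__":
    # Any timestamp inside the period maps to the same start and end.
    print(profile_period(1503084600000 + 123456))
```

Under this assumption, a `groupBy` expression such as `DAY_OF_WEEK(start)` depends only on the period being flushed, not on any field of an individual message.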
Unit tests have been added to validate this functionality. The README has
also been updated to describe the fields available to the `groupBy` expression.
This can also be tested manually, either in a live Profiler or with the
Profiler debugging functions.
The following shows how this change would be used to implement the
problematic profile described in the issue.
Create a profile that references the start of the profile period in the
`groupBy` expression.
```
[Stellar]>>> conf := SHELL_EDIT()
[Stellar]>>> conf
{
"profiles": [
{
"profile": "calender-effects",
"onlyif": "exists(ip_src_addr) and exists(timestamp)",
"foreach": "ip_src_addr",
"init": { "count": 0 },
"update": { "count": "count + 1" },
"result": "count",
"groupBy": ["DAY_OF_WEEK(start)"]
}
]
}
```
Create a message to exercise the profiler.
```
[Stellar]>>> msg := SHELL_EDIT()
[Stellar]>>> msg
{
"ip_src_addr":"10.0.0.1",
"timestamp":"2017-08-18 09:00:00"
}
```
Create a Profiler and apply the messages to it.
```
[Stellar]>>> p := PROFILER_INIT(conf)
[Stellar]>>> PROFILER_APPLY(msg, p)
org.apache.metron.profiler.StandAloneProfiler@4572b5b4
[Stellar]>>> PROFILER_APPLY(msg, p)
org.apache.metron.profiler.StandAloneProfiler@4572b5b4
[Stellar]>>> PROFILER_APPLY(msg, p)
org.apache.metron.profiler.StandAloneProfiler@4572b5b4
```
Flush the profile and validate the result of executing the `groupBy`. The
group value is 6, which corresponds to Friday, the day on which the profile
period started, so the result is correct.
```
[Stellar]>>> PROFILER_FLUSH(p)
[{period={duration=900000, period=1670094, start=1503084600000,
end=1503085500000}, profile=calender-effects, groups=[6], value=3,
entity=10.0.0.1}]
```
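
To double-check the group value by hand, the following sketch converts the period's `start` to a day of week. It assumes UTC and the Sunday=1 through Saturday=7 numbering implied by this thread (6 for Friday, 5 for Thursday). `day_of_week_sunday_first` is a hypothetical helper, not Metron's `DAY_OF_WEEK` implementation.

```python
from datetime import datetime, timezone


def day_of_week_sunday_first(epoch_ms: int) -> int:
    """Return 1 for Sunday through 7 for Saturday, evaluated in UTC (assumed)."""
    dt = datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)
    # isoweekday() yields Monday=1 .. Sunday=7; shift so Sunday=1 .. Saturday=7.
    return dt.isoweekday() % 7 + 1


if __name__ == "__main__":
    start = 1503084600000  # 'start' from the flush output above
    print(datetime.fromtimestamp(start / 1000, tz=timezone.utc).isoformat())
    print(day_of_week_sunday_first(start))
```

The time zone matters near midnight: a period that starts late on Friday in UTC may already fall on Saturday in another zone, so the zone used to evaluate the day of week should be confirmed rather than assumed.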
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/nickwallen/metron METRON-1120
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/metron/pull/708.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #708
----
commit 5579748ad4336a7c1a15c319d59fd6cbdeb6531e
Author: Nick Allen
Date: 2017-08-18T17:37:01Z
METRON-1121 Ignore Profile with Bad 'init', 'update' or 'groupBy'
commit 893b7db84f155ea6af975ee51338f39b763eaedb
Author: Nick Allen
Date: 2017-08-18T17:45:50Z
Rm errant comment
commit da365c8b546678bbe07011e10ab3cd222faa8297
Author: Nick Allen
Date: 2017-08-18T19:01:26Z
METRON-1120 Profile's 'groupBy' Expression Has No Reference to Time
commit 5d8a7a06096d5aa725a0ce3b47fef36a8e14ac72
Author: Nick Allen
Date: 2017-08-18T19:04:35Z
Rm artifacts that should not be in Git
commit c52dce2be6146127eed9af0d2b311ff65f0de551
Author: Nick Allen
Date: 2017-08-18T19:32:07Z
Updated README
commit 54f1c5969268032e0841f0dd4b5e76449b8b3b6f
Author: Nick Allen
Date: 2017-08-18T19:35:31Z
Fix README
----
> Profile's 'groupBy' Expression Has No Reference to Time
> -------------------------------------------------------
>
> Key: METRON-1120
> URL: https://issues.apache.org/jira/browse/METRON-1120
> Project: Metron
> Issue Type: Bug
> Reporter: Nick Allen
> Assignee: Nick Allen
>
> It is often the case that patterns and behaviors will differ based on
> calendar effects like day of week. For example, activity on a weekday can be
> very different from a weekend. The Profiler's "Group By" functionality is one
> way to account for calendar effects.
> This profile definition operates over any incoming telemetry that has an
> `ip_src_addr` and a `timestamp` field. It produces a profile that segments
> the data by day of week. It does so by using a 'groupBy' expression to extract
> the day of week from the telemetry's `timestamp` field.
> {code}
> {
> "profiles": [
> {
> "profile": "calender-effects",
> "onlyif": "exists(ip_src_addr) and exists(timestamp)",
> "foreach": "ip_src_addr",
> "init": { "count": 0 },
> "update": { "count": "count + 1" },
> "result": "count",
> "groupBy": ["DAY_OF_WEEK(TO_EPOCH_TIMESTAMP(timestamp, 'yyyy-MM-dd HH:mm:ss', 'GMT'))"]
> }
> ]
> }
> {code}
> When retrieving profile data using the Profiler Client API, I only want to
> retrieve data from the same day of week to account for any calendar effects.
> The following example retrieves profile data only for Thursdays over the past
> 60 days.
> {code}
> >>> thursday := 5
> >>> PROFILE_GET("calender-effects", "10.0.0.1", PROFILE_FIXED(60, "DAYS"), [thursday])
> {code}
> h3. The Problem
> The `groupBy` expression only has access to the Profile's `result` value. It
> does not have any way to reference the current tick time in the Profiler,
> and message fields such as `timestamp` are not in scope when it executes.
> Here is an example showing the problem.
> Define the profile and a message.
> {code}
> [Stellar]>>> conf
> {
> "profiles": [
> {
> "profile": "calender-effects",
> "onlyif": "exists(ip_src_addr) and exists(timestamp)",
> "foreach": "ip_src_addr",
> "init": { "count": "0" },
> "update": { "count": "count + 1" },
> "result": "count",
> "groupBy": ["DAY_OF_WEEK(TO_EPOCH_TIMESTAMP(timestamp, 'yyyy-MM-dd HH:mm:ss', 'GMT'))"]
> }
> ]
> }
> [Stellar]>>> msg
> {
> "ip_src_addr": "10.0.0.1",
> "protocol": "HTTPS",
> "length": "10",
> "bytes_in": 234,
> "timestamp": "2017-08-17 09:00:00"
> }
> {code}
> Initialize the Profiler and apply the message a few times.
> {code}
> [Stellar]>>> p := PROFILER_INIT(conf)
> [Stellar]>>> PROFILER_APPLY(msg, p)
> org.apache.metron.profiler.StandAloneProfiler@9472c85
> [Stellar]>>> PROFILER_APPLY(msg, p)
> org.apache.metron.profiler.StandAloneProfiler@9472c85
> [Stellar]>>> PROFILER_APPLY(msg, p)
> org.apache.metron.profiler.StandAloneProfiler@9472c85
> {code}
> Flush the profile, which will trigger execution of the `groupBy` expression.
> {code}
> [Stellar]>>> PROFILER_FLUSH(p)
> [!] Bad 'groupBy' expression: Unexpected type: expected=Object, actual=null, expression=DAY_OF_WEEK(TO_EPOCH_TIMESTAMP(timestamp, 'yyyy-MM-dd HH:mm:ss', 'GMT')), profile=calender-effects, entity=10.0.0.1
> org.apache.metron.stellar.dsl.ParseException: Bad 'groupBy' expression: Unexpected type: expected=Object, actual=null, expression=DAY_OF_WEEK(TO_EPOCH_TIMESTAMP(timestamp, 'yyyy-MM-dd HH:mm:ss', 'GMT')), profile=calender-effects, entity=10.0.0.1
> at org.apache.metron.profiler.DefaultProfileBuilder.execute(DefaultProfileBuilder.java:257)
> at org.apache.metron.profiler.DefaultProfileBuilder.flush(DefaultProfileBuilder.java:159)
> at org.apache.metron.profiler.DefaultMessageDistributor.lambda$flush$0(DefaultMessageDistributor.java:101)
> at java.util.concurrent.ConcurrentMap.forEach(ConcurrentMap.java:114)
> at org.apache.metron.profiler.DefaultMessageDistributor.flush(DefaultMessageDistributor.java:99)
> at org.apache.metron.profiler.StandAloneProfiler.flush(StandAloneProfiler.java:82)
> at org.apache.metron.profiler.client.stellar.ProfilerFunctions$ProfilerFlush.apply(ProfilerFunctions.java:191)
> at org.apache.metron.stellar.common.StellarCompiler.lambda$exitTransformationFunc$13(StellarCompiler.java:556)
> at org.apache.metron.stellar.common.StellarCompiler$Expression.apply(StellarCompiler.java:160)
> at org.apache.metron.stellar.common.BaseStellarProcessor.parse(BaseStellarProcessor.java:152)
> at org.apache.metron.stellar.common.shell.StellarExecutor.execute(StellarExecutor.java:287)
> at org.apache.metron.stellar.common.shell.StellarShell.handleStellar(StellarShell.java:270)
> at org.apache.metron.stellar.common.shell.StellarShell.execute(StellarShell.java:409)
> at org.jboss.aesh.console.AeshProcess.run(AeshProcess.java:53)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalArgumentException: Unexpected type: expected=Object, actual=null, expression=DAY_OF_WEEK(TO_EPOCH_TIMESTAMP(timestamp, 'yyyy-MM-dd HH:mm:ss', 'GMT'))
> at org.apache.metron.stellar.common.DefaultStellarStatefulExecutor.execute(DefaultStellarStatefulExecutor.java:128)
> at org.apache.metron.profiler.DefaultProfileBuilder.lambda$execute$3(DefaultProfileBuilder.java:253)
> at java.util.ArrayList.forEach(ArrayList.java:1249)
> at org.apache.metron.profiler.DefaultProfileBuilder.execute(DefaultProfileBuilder.java:253)
> ... 16 more
> {code}