[ 
https://issues.apache.org/jira/browse/METRON-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16133543#comment-16133543
 ] 

ASF GitHub Bot commented on METRON-1120:
----------------------------------------

GitHub user nickwallen opened a pull request:

    https://github.com/apache/metron/pull/708

    Metron 1120

    [METRON-1120](https://issues.apache.org/jira/browse/METRON-1120)
    
    - [ ] This is built on top of METRON-1121 so this should not be committed 
before METRON-1121.
    
    The `groupBy` expression can now reference any of these variables.
    * `profile` The name of the profile.
    * `entity` The name of the entity being profiled.
    * `start` The start time of the profile period in epoch milliseconds.
    * `end` The end time of the profile period in epoch milliseconds.
    * `duration` The duration of the profile period in milliseconds.
    * `result` The result of executing the `result` expression.
    
    Unit tests have been added to validate this functionality. The README has 
also been updated to describe the fields available to the `groupBy` expression.
    
    This can also be tested manually, either in a live Profiler or with the 
Profiler debugging functions. The following shows how this change would be 
used to implement the problematic profile described in the JIRA issue.
    
    Create a profile that references the start of the profile period in the 
`groupBy` expression.
    ```
    [Stellar]>>> conf := SHELL_EDIT()
    [Stellar]>>> conf
    {
      "profiles": [
        {
          "profile": "calender-effects",
          "onlyif":  "exists(ip_src_addr) and exists(timestamp)",
          "foreach": "ip_src_addr",
          "init":    { "count": 0 },
          "update":  { "count": "count + 1" },
          "result":  "count",
          "groupBy": ["DAY_OF_WEEK(start)"]
        }
      ]
    }
    ```
    
    Create a message to exercise the profiler.
    ```
    [Stellar]>>> msg := SHELL_EDIT()
    [Stellar]>>> msg
    {
        "ip_src_addr":"10.0.0.1",
        "timestamp":"2017-08-18 09:00:00"
    }
    ```
    
    Create a Profiler and apply the message to it three times.
    ```
    [Stellar]>>> p := PROFILER_INIT(conf)
    [Stellar]>>> PROFILER_APPLY(msg, p)
    org.apache.metron.profiler.StandAloneProfiler@4572b5b4
    [Stellar]>>> PROFILER_APPLY(msg, p)
    org.apache.metron.profiler.StandAloneProfiler@4572b5b4
    [Stellar]>>> PROFILER_APPLY(msg, p)
    org.apache.metron.profiler.StandAloneProfiler@4572b5b4
    ```
    
    Flush the profile and validate the result of executing the `groupBy`.  The 
group value is 6, which indicates Friday. This is correct because the profile 
period started on Friday, 2017-08-18.
    ```
    [Stellar]>>> PROFILER_FLUSH(p)
    [{period={duration=900000, period=1670094, start=1503084600000, end=1503085500000}, profile=calender-effects, groups=[6], value=3, entity=10.0.0.1}]
    ```
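
The period fields and the group value in the flushed measurement above are related by simple arithmetic. The following Python sketch reproduces them outside of Metron. The helper names `period_bounds` and `day_of_week` are hypothetical, the 15-minute period is taken from `duration=900000`, and the Sunday=1 numbering and UTC time zone are assumptions inferred from the values quoted in this thread (5 for Thursday, 6 for Friday), not taken from the Stellar source.

```python
from datetime import datetime, timezone

# 15 minutes in milliseconds, matching duration=900000 in the flush output.
PERIOD_MS = 15 * 60 * 1000


def period_bounds(timestamp_ms, period_ms=PERIOD_MS):
    """Return the period id, start, end and duration containing timestamp_ms."""
    period = timestamp_ms // period_ms
    start = period * period_ms
    return {
        "period": period,
        "start": start,
        "end": start + period_ms,
        "duration": period_ms,
    }


def day_of_week(epoch_ms):
    """Day of week with Sunday=1 ... Saturday=7 (assumed numbering), in UTC."""
    dt = datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)
    # datetime.weekday() is Monday=0 ... Sunday=6; shift to Sunday=1.
    return (dt.weekday() + 1) % 7 + 1


# Any timestamp inside the period from the flush output maps to its bounds.
bounds = period_bounds(1503084601234)
group = day_of_week(bounds["start"])
print(bounds, group)
```

Under these assumptions, grouping by the period `start` yields the same group for every message applied within the period, regardless of the `timestamp` field the telemetry carries.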


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/nickwallen/metron METRON-1120

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/metron/pull/708.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #708
    
----
commit 5579748ad4336a7c1a15c319d59fd6cbdeb6531e
Author: Nick Allen <[email protected]>
Date:   2017-08-18T17:37:01Z

    METRON-1121 Ignore Profile with Bad 'init', 'update' or 'groupBy'

commit 893b7db84f155ea6af975ee51338f39b763eaedb
Author: Nick Allen <[email protected]>
Date:   2017-08-18T17:45:50Z

    Rm errant comment

commit da365c8b546678bbe07011e10ab3cd222faa8297
Author: Nick Allen <[email protected]>
Date:   2017-08-18T19:01:26Z

    METRON-1120 Profile's 'groupBy' Expression Has No Reference to Time

commit 5d8a7a06096d5aa725a0ce3b47fef36a8e14ac72
Author: Nick Allen <[email protected]>
Date:   2017-08-18T19:04:35Z

    Rm artifacts that should not be in Git

commit c52dce2be6146127eed9af0d2b311ff65f0de551
Author: Nick Allen <[email protected]>
Date:   2017-08-18T19:32:07Z

    Updated README

commit 54f1c5969268032e0841f0dd4b5e76449b8b3b6f
Author: Nick Allen <[email protected]>
Date:   2017-08-18T19:35:31Z

    Fix README

----


> Profile's 'groupBy' Expression Has No Reference to Time
> -------------------------------------------------------
>
>                 Key: METRON-1120
>                 URL: https://issues.apache.org/jira/browse/METRON-1120
>             Project: Metron
>          Issue Type: Bug
>            Reporter: Nick Allen
>            Assignee: Nick Allen
>
> It is often the case that patterns and behaviors will differ based on 
> calendar effects like day of week. For example, activity on a weekday can be 
> very different from a weekend. The Profiler's "Group By" functionality is one 
> way to account for calendar effects.
> The following profile definition operates over any incoming telemetry that 
> has an `ip_src_addr` and a `timestamp` field. It produces a profile that 
> segments the data by day of week. It does so by using a `groupBy` expression 
> to extract the day of week from the telemetry's `timestamp` field.
> {code}
> {
>   "profiles": [
>     {
>       "profile": "calender-effects",
>       "onlyif":  "exists(ip_src_addr) and exists(timestamp)",
>       "foreach": "ip_src_addr",
>       "init":    { "count": 0 },
>       "update":  { "count": "count + 1" },
>       "result":  "count",
>       "groupBy": ["DAY_OF_WEEK(TO_EPOCH_TIMESTAMP(timestamp, 'yyyy-MM-dd HH:mm:ss', 'GMT'))"]
>     }
>   ]
> }
> {code}
> When retrieving profile data using the Profiler Client API, I only want to 
> retrieve data from the same day of week to account for any calendar effects. 
> The following example retrieves profile data only for Thursdays over the past 
> 60 days.
> {code}
> >>> thursday := 5
> >>> PROFILE_GET("calender-effects", "10.0.0.1", PROFILE_FIXED(60, "DAYS"), [thursday])
> {code}
> h3. The Problem
> The `groupBy` expression only has access to the Profile's `result` value.  It 
> does not have any way to reference the current tick time in the Profiler.  
> Here is an example showing the problem.
> Define the profile and a message.
> {code}
> [Stellar]>>> conf
> {
>   "profiles": [
>     {
>       "profile": "calender-effects",
>       "onlyif":  "exists(ip_src_addr) and exists(timestamp)",
>       "foreach": "ip_src_addr",
>       "init":    { "count": "0" },
>       "update":  { "count": "count + 1" },
>       "result":  "count",
>       "groupBy": ["DAY_OF_WEEK(TO_EPOCH_TIMESTAMP(timestamp, 'yyyy-MM-dd HH:mm:ss', 'GMT'))"]
>     }
>   ]
> }
> [Stellar]>>> msg
> {
>      "ip_src_addr": "10.0.0.1",
>      "protocol": "HTTPS",
>      "length": "10",
>      "bytes_in": 234,
>      "timestamp": "2017-08-17 09:00:00"
> }
> {code}
> Initialize the Profiler and apply the message a few times.
> {code}
> [Stellar]>>> p := PROFILER_INIT(conf)
> [Stellar]>>> PROFILER_APPLY(msg, p)
> org.apache.metron.profiler.StandAloneProfiler@9472c85
> [Stellar]>>> PROFILER_APPLY(msg, p)
> org.apache.metron.profiler.StandAloneProfiler@9472c85
> [Stellar]>>> PROFILER_APPLY(msg, p)
> org.apache.metron.profiler.StandAloneProfiler@9472c85
> {code}
> Flush the profile, which will trigger execution of the `groupBy` expression.
> {code}
> [Stellar]>>> PROFILER_FLUSH(p)
> [!] Bad 'groupBy' expression: Unexpected type: expected=Object, actual=null, expression=DAY_OF_WEEK(TO_EPOCH_TIMESTAMP(timestamp, 'yyyy-MM-dd HH:mm:ss', 'GMT')), profile=calender-effects, entity=10.0.0.1
> org.apache.metron.stellar.dsl.ParseException: Bad 'groupBy' expression: Unexpected type: expected=Object, actual=null, expression=DAY_OF_WEEK(TO_EPOCH_TIMESTAMP(timestamp, 'yyyy-MM-dd HH:mm:ss', 'GMT')), profile=calender-effects, entity=10.0.0.1
>       at org.apache.metron.profiler.DefaultProfileBuilder.execute(DefaultProfileBuilder.java:257)
>       at org.apache.metron.profiler.DefaultProfileBuilder.flush(DefaultProfileBuilder.java:159)
>       at org.apache.metron.profiler.DefaultMessageDistributor.lambda$flush$0(DefaultMessageDistributor.java:101)
>       at java.util.concurrent.ConcurrentMap.forEach(ConcurrentMap.java:114)
>       at org.apache.metron.profiler.DefaultMessageDistributor.flush(DefaultMessageDistributor.java:99)
>       at org.apache.metron.profiler.StandAloneProfiler.flush(StandAloneProfiler.java:82)
>       at org.apache.metron.profiler.client.stellar.ProfilerFunctions$ProfilerFlush.apply(ProfilerFunctions.java:191)
>       at org.apache.metron.stellar.common.StellarCompiler.lambda$exitTransformationFunc$13(StellarCompiler.java:556)
>       at org.apache.metron.stellar.common.StellarCompiler$Expression.apply(StellarCompiler.java:160)
>       at org.apache.metron.stellar.common.BaseStellarProcessor.parse(BaseStellarProcessor.java:152)
>       at org.apache.metron.stellar.common.shell.StellarExecutor.execute(StellarExecutor.java:287)
>       at org.apache.metron.stellar.common.shell.StellarShell.handleStellar(StellarShell.java:270)
>       at org.apache.metron.stellar.common.shell.StellarShell.execute(StellarShell.java:409)
>       at org.jboss.aesh.console.AeshProcess.run(AeshProcess.java:53)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalArgumentException: Unexpected type: expected=Object, actual=null, expression=DAY_OF_WEEK(TO_EPOCH_TIMESTAMP(timestamp, 'yyyy-MM-dd HH:mm:ss', 'GMT'))
>       at org.apache.metron.stellar.common.DefaultStellarStatefulExecutor.execute(DefaultStellarStatefulExecutor.java:128)
>       at org.apache.metron.profiler.DefaultProfileBuilder.lambda$execute$3(DefaultProfileBuilder.java:253)
>       at java.util.ArrayList.forEach(ArrayList.java:1249)
>       at org.apache.metron.profiler.DefaultProfileBuilder.execute(DefaultProfileBuilder.java:253)
>       ... 16 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
