[ 
https://issues.apache.org/jira/browse/METRON-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877177#comment-15877177
 ] 

ASF GitHub Bot commented on METRON-690:
---------------------------------------

Github user cestella commented on a diff in the pull request:

    https://github.com/apache/incubator-metron/pull/450#discussion_r102361685
  
    --- Diff: metron-analytics/metron-profiler-client/README.md ---
    @@ -91,37 +60,268 @@ want to change the global Client configuration so as not to disrupt the work of
     | profiler.client.salt.divisor          | The salt divisor used to store profile data.             | Optional | 1000     |
     | hbase.provider.impl                   | The name of the HBaseTableProvider implementation class. | Optional |          |
     
    +
    +### Profile Selectors
    +
    +You will notice that the third argument for `PROFILE_GET` is a list of 
`ProfilePeriod` objects.  This list is expected to
    +be produced by another Stellar function.  There are a couple options 
available.
    +
    +#### `PROFILE_FIXED`
    +
    +The profiler periods associated with a fixed lookback starting from now.  
These are ProfilePeriod objects.
    +```
    +REQUIRED:
    +    durationAgo - How long ago should values be retrieved from?
    +    units - The units of 'durationAgo'.
    +OPTIONAL:
    +    config_overrides - Optional - Map (in curly braces) of name:value 
pairs, each overriding the global config parameter
    +            of the same name. Default is the empty Map, meaning no 
overrides.
    +
    +e.g. To retrieve all the profiles for the last 5 hours.  PROFILE_GET('profile', 'entity', PROFILE_FIXED(5, 'HOURS'))
    +```
    +
    +Note that the `config_overrides` parameter operates exactly as the 
`config_overrides` argument in `PROFILE_GET`.
    +The only available parameters for override are:
    +* `profiler.client.period.duration` 
    +* `profiler.client.period.duration.units`
    +
    +#### `PROFILE_WINDOW`
    +
    +`PROFILE_WINDOW` is intended to provide a finer level of control over selecting windows for profiles:
    +* Specify windows relative to the data timestamp (see the optional `now` 
parameter below)
    +* Specify non-contiguous windows to better handle seasonal data (e.g. the 
last hour for every day for the last month)
    +* Specify profile output excluding holidays
    +* Specify only profile output on a specific day of the week
    +
    +It does this through a domain-specific language, mimicking natural language, that defines which windows are selected.
    +
    +```
    +REQUIRED:
    +    windowSelector - The statement specifying the window to select.
    +OPTIONAL:
    +    now - Optional - The timestamp to use for now.
    +    config_overrides - Optional - Map (in curly braces) of name:value pairs, each overriding the global config parameter
    +            of the same name. Default is the empty Map, meaning no overrides.
    +
    +e.g. To retrieve all the measurements written for 'profile' and 'entity' for the last hour 
    +on the same weekday excluding weekends and US holidays across the last 14 days: 
    +PROFILE_GET('profile', 'entity', PROFILE_WINDOW('1 hour window every 24 hours starting from 14 days ago including the current day of the week excluding weekends, holidays:us'))
    +```
    +
    +Note that the `config_overrides` parameter operates exactly as the 
`config_overrides` argument in `PROFILE_GET`.
    +The only available parameters for override are:
    +* `profiler.client.period.duration`
    +* `profiler.client.period.duration.units`
    +
    +##### The Profile Selector Language
    +
    +The domain specific language can be broken into a series of clauses, some 
optional
    +* <span style="color:blue">Total Temporal Duration</span> - The total 
range of time in which windows may be specified
    +* <span style="color:red">Temporal Window Width</span> - How large each temporal window is
    +* <span style="color:green">Skip distance</span> (optional) - How far to skip between when one window starts and when the next begins
    +* <span style="color:purple">Inclusion/Exclusion specifiers</span> 
(optional) - The set of specifiers to further filter the window
    +
    +One *must* specify either a total temporal duration or a temporal window 
width.
    +The remaining clauses are optional.
    +During the course of the following discussion, we will color code the 
clauses in the examples.
    +
    +From a high level, the language fits the following three forms:
    +
    +* <span style="color:red">`time_interval WINDOW?`</span><span style="color:purple">`(INCLUDING specifier_list)? (EXCLUDING specifier_list)?`</span>
    +* <span style="color:red">`time_interval WINDOW?`</span><span style="color:green">`EVERY time_interval`</span><span style="color:blue">`FROM time_interval (TO time_interval)?`</span><span style="color:purple">`(INCLUDING specifier_list)? (EXCLUDING specifier_list)?`</span>
    +* <span style="color:blue">`FROM time_interval (TO time_interval)?`</span>
    --- End diff --
    
    Actually, the color coding should link the three major forms with the types 
    of clauses used to construct them. For one, it becomes clearer if you look 
    at it from the site-book as well.
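
To make the window-width / skip-distance / total-duration clauses concrete, here is a minimal Python sketch of how a statement such as '1 hour window every 24 hours starting from 14 days ago excluding weekends' might expand into windows. This is not Metron's implementation: the function `sparse_windows`, its parameters, and the assumption that windows step forward from the start of the total duration are all hypothetical, chosen only for illustration.

```python
from datetime import datetime, timedelta

def sparse_windows(now, width, every, start_ago, end_ago=timedelta(0), keep=None):
    """Enumerate (start, end) windows of size `width`, one every `every`,
    between now - start_ago and now - end_ago (hypothetical semantics)."""
    windows = []
    start = now - start_ago            # beginning of the total temporal duration
    limit = now - end_ago              # end of the total temporal duration
    while start + width <= limit:
        # `keep` plays the role of the inclusion/exclusion specifiers
        if keep is None or keep(start):
            windows.append((start, start + width))
        start += every                 # the skip distance
    return windows

def not_weekend(ts):
    # datetime.weekday(): Monday is 0, Saturday is 5, Sunday is 6
    return ts.weekday() < 5

now = datetime(2017, 1, 23, 15, 0)
all_windows = sparse_windows(now, timedelta(hours=1), timedelta(hours=24),
                             timedelta(days=14))
weekday_windows = sparse_windows(now, timedelta(hours=1), timedelta(hours=24),
                                 timedelta(days=14), keep=not_weekend)
```

Modelling the specifiers as a predicate over each window's start keeps the three temporal clauses independent of the filtering, which mirrors how the clauses are described separately above.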


> Create a DSL-based timestamp lookup for profiler to enable sparse windows
> -------------------------------------------------------------------------
>
>                 Key: METRON-690
>                 URL: https://issues.apache.org/jira/browse/METRON-690
>             Project: Metron
>          Issue Type: New Feature
>            Reporter: Casey Stella
>
> I propose that we support the following features:
> * A starting point that is not current time
> * Sparse bins (i.e. the last hour for every tuesday for the last month)
> * The ability to skip events (e.g. weekends, holidays)
> This would result in a new function with the following arguments:
> from - The lookback starting point (default to now)
> fromUnits - The units for the lookback starting point
> to - The ending point for the lookback window (default to from + binSize)
> toUnits - The units for the lookback ending point
> including - A list of conditions which we would include.
> weekend
> holiday
> sunday through saturday
> excluding - A list of conditions which we would skip.
> weekend
> holiday
> sunday through saturday
> binSize - The size of the lookback bin
> binUnits - The units of the lookback bin
> Given the number of arguments, their complexity, and the fact that many 
> are optional, PROFILE_LOOKBACK should accept a string backed by a DSL to 
> express these criteria
> Base Case: A lookback of 1 hour ago
> PROFILE_LOOKBACK( '1 hour bins from now')
> Example 1: The same time window every tuesday for the last month starting one 
> hour ago
> Just to make this as clear as possible, if this is run at 3PM on Monday 
> January 23rd, 2017, it would include the following bins:
> January 17th, 2PM - 3PM
> January 10th, 2PM - 3PM
> January 3rd, 2PM - 3PM
> December 27th, 2PM - 3PM
> PROFILE_LOOKBACK( '1 hour bins from 1 hour to 1 month including tuesdays')
> Example 2: The same time window every sunday for the last month starting one 
> hour ago skipping holidays
> Just to make this as clear as possible, if this is run at 3PM on Monday 
> January 22nd, 2017, it would include the following bins:
> January 16th, 2PM - 3PM
> January 9th, 2PM - 3PM
> January 2nd, 2PM - 3PM
> NOT December 25th
> PROFILE_LOOKBACK( '1 hour bins from 1 hour to 1 month including tuesdays 
> excluding holidays')
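
The bins listed for Example 1 can be reproduced with a short Python sketch. It is only an illustration of the proposed semantics, not the eventual implementation: the helper `weekday_bins` is hypothetical, "1 month" is approximated as 30 days, and candidate bins are assumed to step back one day at a time from the lookback starting point.

```python
from datetime import datetime, timedelta

TUESDAY = 1  # datetime.weekday(): Monday is 0

def weekday_bins(now, bin_size, start_ago, end_ago, weekday):
    """Collect one bin per day, walking backwards from now - start_ago to
    now - end_ago, keeping only bins that start on `weekday`."""
    bins = []
    start = now - start_ago        # most recent candidate bin start
    earliest = now - end_ago       # far end of the lookback
    while start >= earliest:
        if start.weekday() == weekday:
            bins.append((start, start + bin_size))
        start -= timedelta(days=1)
    return bins

# Run at 3PM on Monday January 23rd, 2017, as in Example 1.
now = datetime(2017, 1, 23, 15, 0)
bins = weekday_bins(now,
                    bin_size=timedelta(hours=1),
                    start_ago=timedelta(hours=1),
                    end_ago=timedelta(days=30),   # assumption: 1 month = 30 days
                    weekday=TUESDAY)
```

Under these assumptions the result is the four 2PM - 3PM Tuesday bins given in the description (January 17th, 10th, 3rd and December 27th).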



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
