[ 
https://issues.apache.org/jira/browse/METRON-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877085#comment-15877085
 ] 

ASF GitHub Bot commented on METRON-690:
---------------------------------------

Github user cestella commented on a diff in the pull request:

    https://github.com/apache/incubator-metron/pull/450#discussion_r102355810
  
    --- Diff: metron-analytics/metron-profiler-client/README.md ---
    @@ -91,37 +60,268 @@ want to change the global Client configuration so as not to disrupt the work of
     | profiler.client.salt.divisor          | The salt divisor used to store profile data.             | Optional | 1000     |
     | hbase.provider.impl                   | The name of the HBaseTableProvider implementation class. | Optional |          |
     
    +
    +### Profile Selectors
    +
    +You will notice that the third argument for `PROFILE_GET` is a list of `ProfilePeriod` objects.  This list is expected to
    +be produced by another Stellar function.  There are a couple of options available.
    +
    +#### `PROFILE_FIXED`
    +
    +The profiler periods associated with a fixed lookback starting from now.  
These are ProfilePeriod objects.
    +```
    +REQUIRED:
    +    durationAgo - How long ago should values be retrieved from?
    +    units - The units of 'durationAgo'.
    +OPTIONAL:
    +    config_overrides - Optional - Map (in curly braces) of name:value 
pairs, each overriding the global config parameter
    +            of the same name. Default is the empty Map, meaning no 
overrides.
    +
    +e.g. To retrieve all the profiles for the last 5 hours.  PROFILE_GET('profile', 'entity', PROFILE_FIXED(5, 'HOURS'))
    +```
    +
    +Note that the `config_overrides` parameter operates exactly as the 
`config_overrides` argument in `PROFILE_GET`.
    +The only available parameters for override are:
    +* `profiler.client.period.duration` 
    +* `profiler.client.period.duration.units`
    +
    +#### `PROFILE_WINDOW`
    +
    +`PROFILE_WINDOW` is intended to provide a finer level of control over selecting windows for profiles:
    +* Specify windows relative to the data timestamp (see the optional `now` 
parameter below)
    +* Specify non-contiguous windows to better handle seasonal data (e.g. the 
last hour for every day for the last month)
    +* Specify profile output excluding holidays
    +* Specify only profile output on a specific day of the week
    +
    +It does this through a domain-specific language, mimicking natural language, that defines which windows are selected and which are excluded.
    +
    +```
    +REQUIRED:
    +    windowSelector - The statement specifying the window to select.
    +OPTIONAL:
    +    now - Optional - The timestamp to use for now.
    +    config_overrides - Optional - Map (in curly braces) of name:value 
pairs, each overriding the global config parameter
    +            of the same name. Default is the empty Map, meaning no 
overrides.
    +
    +e.g. To retrieve all the measurements written for 'profile' and 'entity' for the last hour
    +on the same weekday excluding weekends and US holidays across the last 14 days:
    +PROFILE_GET('profile', 'entity', PROFILE_WINDOW('1 hour window every 24 hours starting from 14 days ago including the current day of the week excluding weekends, holidays:us'))
    +```
    +
    +Note that the `config_overrides` parameter operates exactly as the 
`config_overrides` argument in `PROFILE_GET`.
    +The only available parameters for override are:
    +* `profiler.client.period.duration`
    +* `profiler.client.period.duration.units`
    +
    +##### The Profile Selector Language
    +
    +The domain specific language can be broken into a series of clauses, some 
optional
    +* <span style="color:blue">Total Temporal Duration</span> - The total 
range of time in which windows may be specified
    +* <span style="color:red">Temporal Window Width</span> - How large each temporal window is
    +* <span style="color:green">Skip distance</span> (optional) - How far to skip between when one window starts and when the next begins
    +* <span style="color:purple">Inclusion/Exclusion specifiers</span> 
(optional) - The set of specifiers to further filter the window
    +
    +One *must* specify either a total temporal duration or a temporal window 
width.
    +The remaining clauses are optional.
    +During the course of the following discussion, we will color code the 
clauses in the examples.
    +
    +From a high level, the language fits the following three forms:
    +
    +* <span style="color:red">`time_interval WINDOW?`</span><span 
style="color:purple">`(INCLUDING specifier_list)? (EXCLUDING 
specifier_list)?`</span>
    +* <span style="color:red">`time_interval WINDOW?`</span><span 
style="color:green">`EVERY time_interval`</span><span style="color:blue">`FROM 
time_interval (TO time_interval)?`</span><span style="color:purple">`(INCLUDING 
specifier_list)? (EXCLUDING specifier_list)?`</span>
    +* <span style="color:blue">`FROM time_interval (TO time_interval)?`</span>
    +
    +with
    +* `time_interval` representing a time amount followed by a unit (e.g. "1 
hour")
    +* `specifier_list` representing a comma separated list of inclusion or 
exclusion specifiers (e.g. "holidays:us, tuesdays")
    +
    +
    +###### <span style="color:blue">Total Temporal Duration</span>
    +
    +Total temporal duration is specified by a phrase: `FROM time_interval AGO 
TO time_interval AGO`
    +This indicates the beginning and ending of a time interval.
    +* `FROM` - Can be the words "from" or "starting from"
    +* `time_interval` - A time amount followed by a unit (e.g. 1 hour).  The 
unit may be "minute", "day", "hour" with any pluralization.
    +* `TO` - Can be the words "until" or "to"
    +* `AGO` - Optionally the word "ago"
    +
    +The `TO time_interval AGO` portion is optional.  If unspecified then it is 
expected that the time interval ends now.
    +
    +Due to the vagaries of the English language, the from and the to portions, if both are specified, are interchangeable
    +with regard to which one specifies the start and which specifies the end.
    +
    +In other words <span style="color:blue">`starting from 1 hour ago to 30 
minutes ago`</span> and
    +<span style="color:blue">`starting from 30 minutes ago to 1 hour 
ago`</span> specify the same
    +temporal duration.
    +
    +**Examples**
    +
    +* A duration starting 1 hour ago and ending now
    +   * <span style="color:blue">`from 1 hour ago`</span>
    +   * <span style="color:blue">`from 1 hour`</span>
    +   * <span style="color:blue">`starting from 1 hour ago`</span>
    +   * <span style="color:blue">`starting from 1 hour`</span>
    +* A duration starting 1 hour ago and ending 30 minutes ago: 
    +   * <span style="color:blue">`from 1 hour ago until 30 minutes ago`</span>
    +   * <span style="color:blue">`from 30 minutes ago until 1 hour ago`</span>
    +   * <span style="color:blue">`starting from 1 hour ago to 30 minutes 
ago`</span>
    +   * <span style="color:blue">`starting from 1 hour to 30 minutes`</span>
    +
    +###### <span style="color:red">Temporal Window Width</span>
    +
    +Temporal window width is the specification of a window. 
    +A window may either repeat within the total temporal duration or fill the total temporal duration.
    +A window is specified by the phrase: `time_interval WINDOW`
    +* `time_interval` - A time amount followed by a unit (e.g. 1 hour).  The 
unit may be "minute", "day", "hour" with any pluralization.
    +* `WINDOW` - Optionally the word "window"
    +
    +**Examples**
    +
    +* A fixed window starting 2 hours ago and going until now
    +  * <span style="color:red">`2 hour`</span>
    +  * <span style="color:red">`2 hours`</span>
    +  * <span style="color:red">`2 hours window`</span>
    +* A repeating 30 minute window starting 2 hours ago and repeating every 
hour until now.
    +This would result in 2 30-minute wide windows: 2 hours ago and 1 hour ago
    +  * <span style="color:red">`30 minute window`</span><span 
style="color:green">`every 1 hour`</span><span style="color:blue">`starting 
from 2 hours ago`</span>
    +  * <span style="color:red">`30 minutes window`</span><span 
style="color:green">`every 1 hour`</span><span style="color:blue">`from 2 hours 
ago`</span>
    +* A repeating 30 minute window starting 2 hours ago and repeating every 
hour until 30 minutes ago.
    +This would result in 2 30-minute wide windows: 2 hours ago and 1 hour ago
    +  * <span style="color:red">`30 minute window`</span><span 
style="color:green">`every 1 hour`</span><span style="color:blue">`starting 
from 2 hours ago until 30 minutes ago`</span>
    +  * <span style="color:red">`30 minutes window`</span><span 
style="color:green">`every 1 hour`</span><span style="color:blue">`from 2 hours 
ago to 30 minutes ago`</span> 
    +  * <span style="color:red">`30 minutes window`</span><span 
style="color:green">`for every 1 hour`</span><span style="color:blue">`from 30 
minutes ago to 2 hours ago`</span> 
    +
    +###### <span style="color:green">Skip distance</span>
    +
    +Skip distance is the amount of time between the beginning of one temporal window and the beginning of the next.
    +It is, in effect, the window period.  
    +
    +It is specified by the phrase `EVERY time_interval`
    +* `time_interval` - A time amount followed by a unit (e.g. 1 hour).  The 
unit may be "minute", "day", "hour" with any pluralization.
    +* `EVERY` - The word/phrase "every" or "for every"
    +
    +**Examples**
    +
    +* A repeating 30 minute window starting 2 hours ago and repeating every 
hour until now.
    --- End diff --
    
    good feedback, I'll add that.
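
To make the window semantics in the README excerpt above concrete, here is a minimal Python sketch of how the repeating-window examples could be enumerated. It is not Metron's implementation; the function name `enumerate_windows` and its parameters are hypothetical, and the rule that a window is kept only if it ends inside the total duration is an assumption read off the "2 hours ago and 1 hour ago" examples.

```python
from datetime import datetime, timedelta

def enumerate_windows(now, width, every, from_ago, to_ago=timedelta(0)):
    """Return (start, end) pairs for windows of `width`, one every `every`.

    Hypothetical helper, not part of Metron. Assumes a window is kept
    only if it fits entirely inside the total temporal duration.
    """
    if every <= timedelta(0):
        raise ValueError("skip distance must be positive")
    # The from and to portions are interchangeable: the larger offset
    # is treated as the start of the duration, the smaller as the end.
    start = now - max(from_ago, to_ago)
    end = now - min(from_ago, to_ago)
    windows = []
    begin = start
    while begin + width <= end:
        windows.append((begin, begin + width))
        begin += every
    return windows

# '30 minute window every 1 hour from 2 hours ago', with an arbitrary "now"
now = datetime(2017, 2, 21, 12, 0)
windows = enumerate_windows(now, timedelta(minutes=30),
                            timedelta(hours=1), timedelta(hours=2))
for w_start, w_end in windows:
    print(w_start.time(), "-", w_end.time())
```

Under these assumptions, adding `to_ago=timedelta(minutes=30)` or swapping the from and to offsets yields the same two windows, matching the examples in the diff.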


> Create a DSL-based timestamp lookup for profiler to enable sparse windows
> -------------------------------------------------------------------------
>
>                 Key: METRON-690
>                 URL: https://issues.apache.org/jira/browse/METRON-690
>             Project: Metron
>          Issue Type: New Feature
>            Reporter: Casey Stella
>
> I propose that we support the following features:
> * A starting point that is not current time
> * Sparse bins (i.e. the last hour for every tuesday for the last month)
> * The ability to skip events (e.g. weekends, holidays)
> This would result in a new function with the following arguments:
> from - The lookback starting point (default to now)
> fromUnits - The units for the lookback starting point
> to - The ending point for the lookback window (default to from + binSize)
> toUnits - The units for the lookback ending point
> including - A list of conditions which we would include.
> weekend
> holiday
> sunday through saturday
> excluding - A list of conditions which we would skip.
> weekend
> holiday
> sunday through saturday
> binSize - The size of the lookback bin
> binUnits - The units of the lookback bin
> Given the number of arguments, their complexity, and the fact that many
> of them are optional, PROFILE_LOOKBACK should accept a string backed by
> a DSL to express these criteria
> Base Case: A lookback of 1 hour ago
> PROFILE_LOOKBACK( '1 hour bins from now')
> Example 1: The same time window every tuesday for the last month starting one 
> hour ago
> Just to make this as clear as possible, if this is run at 3PM on Monday 
> January 23rd, 2017, it would include the following bins:
> January 17th, 2PM - 3PM
> January 10th, 2PM - 3PM
> January 3rd, 2PM - 3PM
> December 27th, 2PM - 3PM
> PROFILE_LOOKBACK( '1 hour bins from 1 hour to 1 month including tuesdays')
> Example 2: The same time window every sunday for the last month starting one 
> hour ago skipping holidays
> Just to make this as clear as possible, if this is run at 3PM on Monday 
> January 22nd, 2017, it would include the following bins:
> January 16th, 2PM - 3PM
> January 9th, 2PM - 3PM
> January 2nd, 2PM - 3PM
> NOT December 25th
> PROFILE_LOOKBACK( '1 hour bins from 1 hour to 1 month including tuesdays 
> excluding holidays')
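
The bins listed for Example 1 in the issue description can be checked with a short Python sketch. This is only an illustration of the date arithmetic, not the proposed DSL or Metron code; the helper name `tuesday_bins` is hypothetical, and "1 month" is assumed to mean 30 days.

```python
from datetime import datetime, timedelta

TUESDAY = 1  # datetime.weekday() numbers Monday as 0

def tuesday_bins(now, lookback_days=30):
    """Return 1-hour bins starting one hour before `now`, stepping back
    one day at a time and keeping only the bins that fall on a Tuesday.

    Hypothetical helper; treats "1 month" as `lookback_days` days.
    """
    first_start = now - timedelta(hours=1)
    bins = []
    for days_back in range(lookback_days + 1):
        start = first_start - timedelta(days=days_back)
        if start.weekday() == TUESDAY:
            bins.append((start, start + timedelta(hours=1)))
    return bins

# Run at 3PM on Monday January 23rd, 2017, as in Example 1
for start, end in tuesday_bins(datetime(2017, 1, 23, 15, 0)):
    print(start.strftime("%B %d, %I%p"), "-", end.strftime("%I%p"))
```

With a 30-day lookback this yields the four Tuesday bins named in Example 1 (January 17th, 10th, 3rd and December 27th, each 2PM - 3PM).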



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
