[
https://issues.apache.org/jira/browse/METRON-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877077#comment-15877077
]
ASF GitHub Bot commented on METRON-690:
---------------------------------------
Github user nickwallen commented on a diff in the pull request:
https://github.com/apache/incubator-metron/pull/450#discussion_r102354651
--- Diff: metron-analytics/metron-profiler-client/README.md ---
@@ -91,37 +60,268 @@ want to change the global Client configuration so as
not to disrupt the work of
| profiler.client.salt.divisor | The salt divisor used to store
profile data.
| Optional | 1000 |
| hbase.provider.impl | The name of the
HBaseTableProvider implementation class.
| Optional | |
+
+### Profile Selectors
+
+You will notice that the third argument for `PROFILE_GET` is a list of
`ProfilePeriod` objects. This list is expected to
+be produced by another Stellar function. There are a couple of options
available.
+
+#### `PROFILE_FIXED`
+
+Returns the profile periods associated with a fixed lookback starting from now,
as a list of `ProfilePeriod` objects.
+```
+REQUIRED:
+ durationAgo - How long ago should values be retrieved from?
+ units - The units of 'durationAgo'.
+OPTIONAL:
+ config_overrides - Optional - Map (in curly braces) of name:value
pairs, each overriding the global config parameter
+ of the same name. Default is the empty Map, meaning no
overrides.
+
+e.g. To retrieve all the profiles for the last 5 hours.
PROFILE_GET('profile', 'entity', PROFILE_FIXED(5, 'HOURS'))
+```
+
+Note that the `config_overrides` parameter operates exactly as the
`config_overrides` argument in `PROFILE_GET`.
+The only available parameters for override are:
+* `profiler.client.period.duration`
+* `profiler.client.period.duration.units`
+
+#### `PROFILE_WINDOW`
+
+`PROFILE_WINDOW` is intended to provide a finer level of control over
selecting windows for profiles:
+* Specify windows relative to the data timestamp (see the optional `now`
parameter below)
+* Specify non-contiguous windows to better handle seasonal data (e.g. the
last hour for every day for the last month)
+* Specify profile output excluding holidays
+* Specify only profile output on a specific day of the week
+
+It does this through a domain-specific language, mimicking natural language,
that defines which windows are included and excluded.
+
+```
+REQUIRED:
+ windowSelector - The statement specifying the window to select.
+OPTIONAL:
+ now - Optional - The timestamp to use for now.
+ config_overrides - Optional - Map (in curly braces) of name:value
pairs, each overriding the global config parameter
+ of the same name. Default is the empty Map, meaning no
overrides.
+
+e.g. To retrieve all the measurements written for 'profile' and 'entity'
for the last hour
+on the same weekday excluding weekends and US holidays across the last 14
days:
+PROFILE_GET('profile', 'entity', PROFILE_WINDOW('1 hour window every 24
hours starting from 14 days ago including the current day of the week excluding
weekends, holidays:us'))
+```
+
+Note that the `config_overrides` parameter operates exactly as the
`config_overrides` argument in `PROFILE_GET`.
+The only available parameters for override are:
+* `profiler.client.period.duration`
+* `profiler.client.period.duration.units`
+
+##### The Profile Selector Language
+
+The domain specific language can be broken into a series of clauses, some
optional
+* <span style="color:blue">Total Temporal Duration</span> - The total
range of time in which windows may be specified
+* <span style="color:red">Temporal Window Width</span> - How large each
temporal window is
+* <span style="color:green">Skip distance</span> (optional) - How far to
skip between when one window starts and when the next begins
+* <span style="color:purple">Inclusion/Exclusion specifiers</span>
(optional) - The set of specifiers to further filter the window
+
+One *must* specify either a total temporal duration or a temporal window
width.
+The remaining clauses are optional.
+During the course of the following discussion, we will color code the
clauses in the examples.
+
+From a high level, the language fits the following three forms:
+
+* <span style="color:red">`time_interval WINDOW?`</span><span
style="color:purple">`(INCLUDING specifier_list)? (EXCLUDING
specifier_list)?`</span>
+* <span style="color:red">`time_interval WINDOW?`</span><span
style="color:green">`EVERY time_interval`</span><span style="color:blue">`FROM
time_interval (TO time_interval)?`</span><span style="color:purple">`(INCLUDING
specifier_list)? (EXCLUDING specifier_list)?`</span>
+* <span style="color:blue">`FROM time_interval (TO time_interval)?`</span>
--- End diff --
I'll have to check out the site book. It's too bad GitHub doesn't render
that.
> Create a DSL-based timestamp lookup for profiler to enable sparse windows
> -------------------------------------------------------------------------
>
> Key: METRON-690
> URL: https://issues.apache.org/jira/browse/METRON-690
> Project: Metron
> Issue Type: New Feature
> Reporter: Casey Stella
>
> I propose that we support the following features:
> * A starting point that is not current time
> * Sparse bins (i.e. the last hour for every tuesday for the last month)
> * The ability to skip events (e.g. weekends, holidays)
> This would result in a new function with the following arguments:
> from - The lookback starting point (default to now)
> fromUnits - The units for the lookback starting point
> to - The ending point for the lookback window (default to from + binSize)
> toUnits - The units for the lookback ending point
> including - A list of conditions which we would include.
> weekend
> holiday
> sunday through saturday
> excluding - A list of conditions which we would skip.
> weekend
> holiday
> sunday through saturday
> binSize - The size of the lookback bin
> binUnits - The units of the lookback bin
> Given the number of arguments, their complexity, and the fact that many
> of them are optional,
> PROFILE_LOOKBACK should accept a string backed by a DSL to express these criteria.
> Base Case: A lookback of 1 hour ago
> PROFILE_LOOKBACK( '1 hour bins from now')
> Example 1: The same time window every tuesday for the last month starting one
> hour ago
> Just to make this as clear as possible, if this is run at 3PM on Monday
> January 23rd, 2017, it would include the following bins:
> January 17th, 2PM - 3PM
> January 10th, 2PM - 3PM
> January 3rd, 2PM - 3PM
> December 27th, 2PM - 3PM
> PROFILE_LOOKBACK( '1 hour bins from 1 hour to 1 month including tuesdays')
> Example 2: The same time window every sunday for the last month starting one
> hour ago skipping holidays
> Just to make this as clear as possible, if this is run at 3PM on Monday
> January 22nd, 2017, it would include the following bins:
> January 16th, 2PM - 3PM
> January 9th, 2PM - 3PM
> January 2nd, 2PM - 3PM
> NOT December 25th
> PROFILE_LOOKBACK( '1 hour bins from 1 hour to 1 month including tuesdays
> excluding holidays')
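
To make the bin arithmetic in Example 1 concrete, here is a minimal Python sketch of how such bins could be enumerated. It is not Metron code and does not parse the proposed DSL: `lookback_bins` and its parameters are hypothetical names, 'including tuesdays' is modeled as a weekday filter over bins that repeat every 24 hours, and '1 month' is assumed to mean 31 days.

```python
from datetime import datetime, timedelta

TUESDAY = 1  # datetime.weekday() numbers Monday as 0


def lookback_bins(now, bin_width, start_ago, end_ago, include_weekdays):
    """Hypothetical helper: enumerate daily-repeating bins, newest first.

    The newest candidate bin starts `start_ago` before `now`. Candidates
    step back one day at a time until they would start earlier than
    `now - end_ago`. Only bins that start on a weekday listed in
    `include_weekdays` are kept.
    """
    bins = []
    start = now - start_ago
    earliest = now - end_ago
    while start >= earliest:
        if start.weekday() in include_weekdays:
            bins.append((start, start + bin_width))
        start -= timedelta(days=1)
    return bins


# Example 1: run at 3PM on Monday, January 23rd, 2017.
now = datetime(2017, 1, 23, 15, 0)
bins = lookback_bins(
    now,
    bin_width=timedelta(hours=1),
    start_ago=timedelta(hours=1),
    end_ago=timedelta(days=31),  # assumption: "1 month" means 31 days
    include_weekdays={TUESDAY},
)
for begin, end in bins:
    print(begin.strftime("%B %d, %I%p"), "-", end.strftime("%I%p"))
```

Under these assumptions the sketch selects the 2PM - 3PM bins on January 17th, January 10th, January 3rd and December 27th, matching the list in Example 1. An 'excluding holidays' clause could be modeled the same way, as a second filter that drops bins falling on a supplied set of dates.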
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)