Charles Givre created DRILL-7077:
------------------------------------
Summary: Add Function to Facilitate Time Series Analysis
Key: DRILL-7077
URL: https://issues.apache.org/jira/browse/DRILL-7077
Project: Apache Drill
Issue Type: New Feature
Affects Versions: 1.16.0
Reporter: Charles Givre
Assignee: Charles Givre
When analyzing time-based data, you often have to aggregate by time grain.
Some time grains are easy to calculate, but others, such as quarter, can be
quite difficult. This function enables a user to quickly and easily aggregate
data by various units of time. Usage is as follows:
{code:java}
SELECT <fields>
FROM <data>
GROUP BY nearestDate(<timestamp_column>, <time increment>){code}
Suppose a user wants to count the number of hits on a web server per
15-minute interval. The query might look like this:
{code:java}
SELECT nearestDate(`eventDate`, 'QUARTER_HOUR') AS eventDate,
COUNT(*) AS hitCount
FROM dfs.`log.httpd`
GROUP BY nearestDate(`eventDate`, 'QUARTER_HOUR'){code}
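To make the intended result of the query above concrete, here is a minimal Java sketch of the same aggregation done in memory. It is not the Drill implementation. The class and method names (QuarterHourHitCount, quarterHourBucket, countHits) are hypothetical, and it assumes that each timestamp is assigned to the start of its 15-minute interval (rounded down).

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Illustrative only: mimics the GROUP BY in the query, not Drill's code. */
final class QuarterHourHitCount {

    /** Assumption: a timestamp maps to the start of its 15-minute interval. */
    static LocalDateTime quarterHourBucket(LocalDateTime ts) {
        int minute = (ts.getMinute() / 15) * 15;  // 0, 15, 30 or 45
        return ts.truncatedTo(ChronoUnit.HOURS).withMinute(minute);
    }

    /** Counts events per bucket; TreeMap keeps the buckets in time order. */
    static Map<LocalDateTime, Long> countHits(List<LocalDateTime> eventDates) {
        return eventDates.stream()
                .collect(Collectors.groupingBy(
                        QuarterHourHitCount::quarterHourBucket,
                        TreeMap::new,
                        Collectors.counting()));
    }
}
```

Under these assumptions, events at 07:01 and 07:14 land in the 07:00 bucket, while an event at 07:15 starts a new bucket.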
The function currently supports the following time units:
* YEAR
* QUARTER
* MONTH
* WEEK_SUNDAY
* WEEK_MONDAY
* DAY
* HOUR
* HALF_HOUR / 30MIN
* QUARTER_HOUR / 15MIN
* MINUTE
* 30SECOND
* 15SECOND
* SECOND
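
As a rough illustration of what each unit in the list above could mean, the following Java sketch maps a timestamp to the start of its containing period using java.time. It is not the code proposed for Drill. The class NearestDateSketch and its helpers are hypothetical, and it assumes that the function rounds down to the start of the period and that 30MIN and 15MIN are aliases for HALF_HOUR and QUARTER_HOUR.

```java
import java.time.DayOfWeek;
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;
import java.time.temporal.TemporalAdjusters;
import java.util.Locale;

/** Illustrative only: NOT the Drill implementation of nearestDate. */
final class NearestDateSketch {

    /** Hypothetical helper: start of the period of the given unit containing ts. */
    static LocalDateTime floor(LocalDateTime ts, String unit) {
        LocalDateTime day = ts.truncatedTo(ChronoUnit.DAYS);  // midnight of ts
        switch (unit.toUpperCase(Locale.ROOT)) {
            case "YEAR":
                return day.withDayOfYear(1);
            case "QUARTER": {
                // Quarters start in months 1, 4, 7 and 10.
                int firstMonth = ((ts.getMonthValue() - 1) / 3) * 3 + 1;
                return day.withDayOfMonth(1).withMonth(firstMonth);
            }
            case "MONTH":
                return day.withDayOfMonth(1);
            case "WEEK_SUNDAY":
                return day.with(TemporalAdjusters.previousOrSame(DayOfWeek.SUNDAY));
            case "WEEK_MONDAY":
                return day.with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY));
            case "DAY":
                return day;
            case "HOUR":
                return ts.truncatedTo(ChronoUnit.HOURS);
            case "HALF_HOUR":
            case "30MIN":  // assumed alias
                return floorMinutes(ts, 30);
            case "QUARTER_HOUR":
            case "15MIN":  // assumed alias
                return floorMinutes(ts, 15);
            case "MINUTE":
                return ts.truncatedTo(ChronoUnit.MINUTES);
            case "30SECOND":
                return floorSeconds(ts, 30);
            case "15SECOND":
                return floorSeconds(ts, 15);
            case "SECOND":
                return ts.truncatedTo(ChronoUnit.SECONDS);
            default:
                throw new IllegalArgumentException("Unsupported time unit: " + unit);
        }
    }

    /** Rounds the minute-of-hour down to a multiple of step. */
    private static LocalDateTime floorMinutes(LocalDateTime ts, int step) {
        return ts.truncatedTo(ChronoUnit.HOURS)
                 .plusMinutes((ts.getMinute() / step) * step);
    }

    /** Rounds the second-of-minute down to a multiple of step. */
    private static LocalDateTime floorSeconds(LocalDateTime ts, int step) {
        return ts.truncatedTo(ChronoUnit.MINUTES)
                 .plusSeconds((ts.getSecond() / step) * step);
    }
}
```

With this sketch, NearestDateSketch.floor(LocalDateTime.of(2019, 2, 15, 7, 22, 33), "QUARTER_HOUR") yields 2019-02-15T07:15, and the same timestamp with "QUARTER" yields 2019-01-01T00:00. The quarter case shows why that grain is less trivial than the others: java.time has no truncation unit for it, so the first month of the quarter has to be computed by hand.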