[ 
https://issues.apache.org/jira/browse/DRILL-7652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17063351#comment-17063351
 ] 

ASF GitHub Bot commented on DRILL-7652:
---------------------------------------

cgivre commented on pull request #2033: DRILL-7652: Add time_bucket() function 
for time series analysis
URL: https://github.com/apache/drill/pull/2033
 
 
   # [DRILL-7652](https://issues.apache.org/jira/browse/DRILL-7652): Add 
time_bucket() function for Time Series Analysis
   
   ## Description
   
   This PR adds two UDFs that facilitate time series analysis.  It also 
updates the `README.md` in the `contrib/udf` folder to document the 
new UDFs.
   
   ## Documentation
   These functions are useful for doing time series analysis by grouping the 
data into arbitrary intervals.  See 
https://blog.timescale.com/blog/simplified-time-series-analytics-using-the-time_bucket-function/ 
for more examples.
   
   There are two versions of the function:
   * `time_bucket(<timestamp>, <interval>)`
   * `time_bucket_ns(<timestamp>,<interval>)`
   
   Both functions accept a `BIGINT` timestamp and an interval in milliseconds 
as arguments. The `time_bucket()` function accepts timestamps in milliseconds 
and `time_bucket_ns()` accepts timestamps in nanoseconds.  Each function 
returns the start of the bucket as a timestamp in the same unit as its input.
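
The bucketing arithmetic described above can be sketched in Java, the language Drill UDFs are written in. This is an illustrative sketch only, not the code from this PR: the class `TimeBucketSketch` and its method names are hypothetical, and using `Math.floorMod` so that pre-epoch (negative) timestamps also round down is an assumption about the intended behavior.

```java
public class TimeBucketSketch {

    // Hypothetical helper: floors a millisecond timestamp to the start of
    // the bucket that contains it. The interval is given in milliseconds.
    static long timeBucket(long timestampMs, long intervalMs) {
        return timestampMs - Math.floorMod(timestampMs, intervalMs);
    }

    // Hypothetical helper: same idea for nanosecond timestamps. The interval
    // is still given in milliseconds, so it is converted to nanoseconds first.
    static long timeBucketNs(long timestampNs, long intervalMs) {
        long intervalNs = intervalMs * 1_000_000L;
        return timestampNs - Math.floorMod(timestampNs, intervalNs);
    }

    public static void main(String[] args) {
        long fiveMinutesMs = 5 * 60 * 1000L; // 300000 ms

        // 1000123 ms falls in the bucket that starts at 900000 ms.
        System.out.println(timeBucket(1_000_123L, fiveMinutesMs)); // prints 900000

        // The nanosecond variant returns a nanosecond timestamp.
        System.out.println(timeBucketNs(1_000_000_000_123L, fiveMinutesMs)); // prints 900000000000
    }
}
```

Under these assumptions, every timestamp inside the same interval maps to the same bucket start, which is what makes the result usable as a `GROUP BY` key.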
   
   ### Example:
   The query below calculates the average of the `cpu` metric for each 
five-minute interval (300000 milliseconds).
   
   ```sql
   SELECT time_bucket(time_stamp, 300000) AS five_min, avg(cpu)
     FROM metrics
     GROUP BY five_min
     ORDER BY five_min DESC LIMIT 12;
   ```
   
   ## Testing
   There are a series of unit tests included with this PR.
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 


> Add time_bucket() Function for Time Series Analysis
> ---------------------------------------------------
>
>                 Key: DRILL-7652
>                 URL: https://issues.apache.org/jira/browse/DRILL-7652
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.17.0
>            Reporter: Charles Givre
>            Priority: Major
>             Fix For: 1.18.0
>
>
> These functions are useful for doing time series analysis by grouping the 
> data into arbitrary intervals. See 
> https://blog.timescale.com/blog/simplified-time-series-analytics-using-the-time_bucket-function/ 
> for more examples.
> There are two versions of the function:
> * `time_bucket(<timestamp>, <interval>)`
> * `time_bucket_ns(<timestamp>,<interval>)`
> Both functions accept a `BIGINT` timestamp and an interval in milliseconds as 
> arguments. The `time_bucket()` function accepts timestamps in milliseconds 
> and `time_bucket_ns()` accepts timestamps in nanoseconds. Each function 
> returns the start of the bucket as a timestamp in the same unit as its input.
> ### Example:
> The query below calculates the average of the `cpu` metric for each 
> five-minute interval (300000 milliseconds).
> ```sql
> SELECT time_bucket(time_stamp, 300000) AS five_min, avg(cpu)
>  FROM metrics
>  GROUP BY five_min
>  ORDER BY five_min DESC LIMIT 12;
> ```



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
