cgivre opened a new pull request #2033: DRILL-7652: Add time_bucket() function 
for time series analysis
URL: https://github.com/apache/drill/pull/2033
 
 
   # [DRILL-7652](https://issues.apache.org/jira/browse/DRILL-7652): Add 
time_bucket() function for Time Series Analysis
   
   ## Description
   
   This PR adds two UDFs that facilitate time series analysis. It also 
updates the `README.md` in the `contrib/udf` folder to document the 
new UDFs.
   
   ## Documentation
   These functions are useful for time series analysis because they group 
data into arbitrary intervals. For more examples, see: 
https://blog.timescale.com/blog/simplified-time-series-analytics-using-the-time_bucket-function/
   
   There are two versions of the function:
   * `time_bucket(<timestamp>, <interval>)`
   * `time_bucket_ns(<timestamp>,<interval>)`
   
   Both functions accept a `BIGINT` timestamp and an interval in milliseconds 
as arguments. The `time_bucket_ns()` function accepts timestamps in nanoseconds, 
and `time_bucket()` accepts timestamps in milliseconds. Both return timestamps 
in the same unit as the input.
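
The bucketing arithmetic can be sketched in plain Java. This is a minimal illustration, not the code from this PR: the class and method names are hypothetical, and it assumes that buckets are aligned to the Unix epoch and that each timestamp is floored to the start of its bucket.

```java
// Hypothetical sketch of the bucketing arithmetic; not the UDF code from this PR.
// Assumption: buckets are aligned to the Unix epoch and timestamps are floored.
final class TimeBucketSketch {
    private static final long NANOS_PER_MILLI = 1_000_000L;

    private TimeBucketSketch() {}

    /** Floors a millisecond timestamp to the start of its bucket. */
    static long timeBucket(long timestampMs, long intervalMs) {
        if (intervalMs <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        // floorMod keeps the result correct for timestamps before the epoch.
        return timestampMs - Math.floorMod(timestampMs, intervalMs);
    }

    /** Same idea for nanosecond timestamps; the interval is still in milliseconds. */
    static long timeBucketNs(long timestampNs, long intervalMs) {
        if (intervalMs <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        // Convert the interval to nanoseconds, failing loudly on overflow.
        long intervalNs = Math.multiplyExact(intervalMs, NANOS_PER_MILLI);
        return timestampNs - Math.floorMod(timestampNs, intervalNs);
    }
}
```

Under these assumptions, `TimeBucketSketch.timeBucket(1585272833000L, 300000L)` yields `1585272600000L`, the start of the five-minute (300000 ms) bucket that contains the input timestamp.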
   
   ### Example:
   The query below calculates the average of the `cpu` metric for each 
five-minute interval.
   
   ```sql
   SELECT time_bucket(time_stamp, 300000) AS five_min, avg(cpu)
     FROM metrics
     GROUP BY five_min
     ORDER BY five_min DESC LIMIT 12;
   ```
   
   ## Testing
   This PR includes a series of unit tests.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
