[jira] [Commented] (DRILL-7668) Allow Time Bucket Function to Accept Floats and Timestamps

2020-04-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17081350#comment-17081350
 ] 

ASF GitHub Bot commented on DRILL-7668:
---

asfgit commented on pull request #2040: DRILL-7668: Allow Time Bucket Function 
to Accept Floats and Timestamps
URL: https://github.com/apache/drill/pull/2040
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow Time Bucket Function to Accept Floats and Timestamps
> --
>
> Key: DRILL-7668
> URL: https://issues.apache.org/jira/browse/DRILL-7668
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.18.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> Drill has a function `time_bucket()` which facilitates time series analysis. 
> This PR extends this function to accept `FLOAT8` and `TIMESTAMPS` as input.  
> Floats are typically not used for timestamps, however in the event that the 
> data is coming from imperfect files, the numbers may be read as floats and 
> hence require casting in queries.  This PR makes this easier. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7668) Allow Time Bucket Function to Accept Floats and Timestamps

2020-04-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076622#comment-17076622
 ] 

ASF GitHub Bot commented on DRILL-7668:
---

cgivre commented on issue #2040: DRILL-7668: Allow Time Bucket Function to 
Accept Floats and Timestamps
URL: https://github.com/apache/drill/pull/2040#issuecomment-610011770
 
 
   Thanks @vvysotskyi for the review.
   Commits squashed.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow Time Bucket Function to Accept Floats and Timestamps
> --
>
> Key: DRILL-7668
> URL: https://issues.apache.org/jira/browse/DRILL-7668
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.18.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> Drill has a function `time_bucket()` which facilitates time series analysis. 
> This PR extends this function to accept `FLOAT8` and `TIMESTAMPS` as input.  
> Floats are typically not used for timestamps, however in the event that the 
> data is coming from imperfect files, the numbers may be read as floats and 
> hence require casting in queries.  This PR makes this easier. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7668) Allow Time Bucket Function to Accept Floats and Timestamps

2020-04-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076600#comment-17076600
 ] 

ASF GitHub Bot commented on DRILL-7668:
---

cgivre commented on pull request #2040: DRILL-7668: Allow Time Bucket Function 
to Accept Floats and Timestamps
URL: https://github.com/apache/drill/pull/2040#discussion_r404339305
 
 

 ##
 File path: 
contrib/udfs/src/main/java/org/apache/drill/exec/udfs/TimeBucketFunctions.java
 ##
 @@ -97,9 +99,86 @@ public void eval() {
   long timestamp = inputDate.value;
 
   // Get the interval in milliseconds
-  long intervalToAdd = interval.value;
+  long groupByInterval = interval.value;
 
-  out.value = timestamp - (timestamp % intervalToAdd);
+  out.value = timestamp - (timestamp % groupByInterval);
+}
+  }
+
+  /**
+   * This function is used for facilitating time series analysis by creating 
buckets of time intervals.  See
+   * 
https://blog.timescale.com/blog/simplified-time-series-analytics-using-the-time_bucket-function/
 for usage. The function takes two arguments:
+   * 1. The timestamp (as a Drill timestamp)
+   * 2. The desired bucket interval IN milliseconds
+   *
+   * The function returns a BIGINT of the nearest time bucket.
+   */
+  @FunctionTemplate(name = "time_bucket",
+scope = FunctionTemplate.FunctionScope.SIMPLE,
+nulls = FunctionTemplate.NullHandling.NULL_IF_NULL)
+  public static class TimestampTimeBucketFunction implements DrillSimpleFunc {
+
+@Param
+TimeStampHolder inputDate;
+
+@Param
+BigIntHolder interval;
+
+@Output
+TimeStampHolder out;
+
+@Override
+public void setup() {
+}
+
+@Override
+public void eval() {
+  // Get the timestamp in milliseconds
+  long timestamp = inputDate.value;
+
+  // Get the interval in milliseconds
+  long groupByInterval = interval.value;
+
+  java.time.Instant instant = java.time.Instant.ofEpochMilli(timestamp - 
(timestamp % groupByInterval));
 
 Review comment:
   Done.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow Time Bucket Function to Accept Floats and Timestamps
> --
>
> Key: DRILL-7668
> URL: https://issues.apache.org/jira/browse/DRILL-7668
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.18.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> Drill has a function `time_bucket()` which facilitates time series analysis. 
> This PR extends this function to accept `FLOAT8` and `TIMESTAMPS` as input.  
> Floats are typically not used for timestamps, however in the event that the 
> data is coming from imperfect files, the numbers may be read as floats and 
> hence require casting in queries.  This PR makes this easier. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7668) Allow Time Bucket Function to Accept Floats and Timestamps

2020-04-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076058#comment-17076058
 ] 

ASF GitHub Bot commented on DRILL-7668:
---

vvysotskyi commented on pull request #2040: DRILL-7668: Allow Time Bucket 
Function to Accept Floats and Timestamps
URL: https://github.com/apache/drill/pull/2040#discussion_r403849368
 
 

 ##
 File path: 
contrib/udfs/src/main/java/org/apache/drill/exec/udfs/TimeBucketFunctions.java
 ##
 @@ -97,9 +99,86 @@ public void eval() {
   long timestamp = inputDate.value;
 
   // Get the interval in milliseconds
-  long intervalToAdd = interval.value;
+  long groupByInterval = interval.value;
 
-  out.value = timestamp - (timestamp % intervalToAdd);
+  out.value = timestamp - (timestamp % groupByInterval);
+}
+  }
+
+  /**
+   * This function is used for facilitating time series analysis by creating 
buckets of time intervals.  See
+   * 
https://blog.timescale.com/blog/simplified-time-series-analytics-using-the-time_bucket-function/
 for usage. The function takes two arguments:
+   * 1. The timestamp (as a Drill timestamp)
+   * 2. The desired bucket interval IN milliseconds
+   *
+   * The function returns a BIGINT of the nearest time bucket.
+   */
+  @FunctionTemplate(name = "time_bucket",
+scope = FunctionTemplate.FunctionScope.SIMPLE,
+nulls = FunctionTemplate.NullHandling.NULL_IF_NULL)
+  public static class TimestampTimeBucketFunction implements DrillSimpleFunc {
+
+@Param
+TimeStampHolder inputDate;
+
+@Param
+BigIntHolder interval;
+
+@Output
+TimeStampHolder out;
+
+@Override
+public void setup() {
+}
+
+@Override
+public void eval() {
+  // Get the timestamp in milliseconds
+  long timestamp = inputDate.value;
+
+  // Get the interval in milliseconds
+  long groupByInterval = interval.value;
+
+  java.time.Instant instant = java.time.Instant.ofEpochMilli(timestamp - 
(timestamp % groupByInterval));
 
 Review comment:
   Looks like creating `Instant` here may be also omitted.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow Time Bucket Function to Accept Floats and Timestamps
> --
>
> Key: DRILL-7668
> URL: https://issues.apache.org/jira/browse/DRILL-7668
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.18.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> Drill has a function `time_bucket()` which facilitates time series analysis. 
> This PR extends this function to accept `FLOAT8` and `TIMESTAMPS` as input.  
> Floats are typically not used for timestamps, however in the event that the 
> data is coming from imperfect files, the numbers may be read as floats and 
> hence require casting in queries.  This PR makes this easier. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7668) Allow Time Bucket Function to Accept Floats and Timestamps

2020-04-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076000#comment-17076000
 ] 

ASF GitHub Bot commented on DRILL-7668:
---

cgivre commented on issue #2040: DRILL-7668: Allow Time Bucket Function to 
Accept Floats and Timestamps
URL: https://github.com/apache/drill/pull/2040#issuecomment-609528521
 
 
   @vvysotskyi 
   Thanks for the review.  I made the requested changes to the PR.  If this is 
ok, I'll squash commits. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow Time Bucket Function to Accept Floats and Timestamps
> --
>
> Key: DRILL-7668
> URL: https://issues.apache.org/jira/browse/DRILL-7668
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.18.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> Drill has a function `time_bucket()` which facilitates time series analysis. 
> This PR extends this function to accept `FLOAT8` and `TIMESTAMPS` as input.  
> Floats are typically not used for timestamps, however in the event that the 
> data is coming from imperfect files, the numbers may be read as floats and 
> hence require casting in queries.  This PR makes this easier. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7668) Allow Time Bucket Function to Accept Floats and Timestamps

2020-04-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17075995#comment-17075995
 ] 

ASF GitHub Bot commented on DRILL-7668:
---

cgivre commented on pull request #2040: DRILL-7668: Allow Time Bucket Function 
to Accept Floats and Timestamps
URL: https://github.com/apache/drill/pull/2040#discussion_r403792240
 
 

 ##
 File path: 
contrib/udfs/src/main/java/org/apache/drill/exec/udfs/TimeBucketFunctions.java
 ##
 @@ -97,9 +99,88 @@ public void eval() {
   long timestamp = inputDate.value;
 
   // Get the interval in milliseconds
-  long intervalToAdd = interval.value;
+  long groupByInterval = interval.value;
 
-  out.value = timestamp - (timestamp % intervalToAdd);
+  out.value = timestamp - (timestamp % groupByInterval);
+}
+  }
+
+  /**
+   * This function is used for facilitating time series analysis by creating 
buckets of time intervals.  See
+   * 
https://blog.timescale.com/blog/simplified-time-series-analytics-using-the-time_bucket-function/
 for usage. The function takes two arguments:
+   * 1. The timestamp (as a Drill timestamp)
+   * 2. The desired bucket interval IN milliseconds
+   *
+   * The function returns a BIGINT of the nearest time bucket.
+   */
+  @FunctionTemplate(name = "time_bucket",
+scope = FunctionTemplate.FunctionScope.SIMPLE,
+nulls = FunctionTemplate.NullHandling.NULL_IF_NULL)
+  public static class TimestampTimeBucketFunction implements DrillSimpleFunc {
+
+@Param
+TimeStampHolder inputDate;
+
+@Param
+BigIntHolder interval;
+
+@Output
+TimeStampHolder out;
+
+@Override
+public void setup() {
+}
+
+@Override
+public void eval() {
+  // Get the timestamp in milliseconds
+  long timestamp = inputDate.value;
+
+  // Get the interval in milliseconds
+  long groupByInterval = interval.value;
+
+  java.time.Instant instant = java.time.Instant.ofEpochMilli(timestamp - 
(timestamp % groupByInterval));
+  java.time.LocalDateTime localDate = 
instant.atZone(java.time.ZoneId.of("UTC")).toLocalDateTime();
+
+  out.value = 
localDate.atZone(java.time.ZoneId.of("UTC")).toInstant().toEpochMilli();
 
 Review comment:
   Nope.  Not sure why I did that, but I simplified and tested this with a few 
different timezones on my computer. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow Time Bucket Function to Accept Floats and Timestamps
> --
>
> Key: DRILL-7668
> URL: https://issues.apache.org/jira/browse/DRILL-7668
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.18.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> Drill has a function `time_bucket()` which facilitates time series analysis. 
> This PR extends this function to accept `FLOAT8` and `TIMESTAMPS` as input.  
> Floats are typically not used for timestamps, however in the event that the 
> data is coming from imperfect files, the numbers may be read as floats and 
> hence require casting in queries.  This PR makes this easier. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7668) Allow Time Bucket Function to Accept Floats and Timestamps

2020-04-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17075994#comment-17075994
 ] 

ASF GitHub Bot commented on DRILL-7668:
---

cgivre commented on pull request #2040: DRILL-7668: Allow Time Bucket Function 
to Accept Floats and Timestamps
URL: https://github.com/apache/drill/pull/2040#discussion_r403792155
 
 

 ##
 File path: 
contrib/udfs/src/test/java/org/apache/drill/exec/udfs/TestTimeBucketFunction.java
 ##
 @@ -92,6 +94,28 @@ public void testTimeBucket() throws Exception {
   .go();
   }
 
+  @Test
+  public void testDoubleTimeBucket() throws Exception {
+String query = "SELECT time_bucket(CAST(1451606760 AS DOUBLE), 30) AS 
high FROM (values(1))";
+testBuilder()
+  .sqlQuery(query)
+  .ordered()
+  .baselineColumns("high")
+  .baselineValues(145140L)
+  .go();
+  }
+
+  @Test
+  public void testTimeBucketTimestamp() throws Exception {
+String query = "SELECT time_bucket(CAST(1585272833845 AS TIMESTAMP), 
30) AS high FROM (values(1))";
 
 Review comment:
   Removed.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow Time Bucket Function to Accept Floats and Timestamps
> --
>
> Key: DRILL-7668
> URL: https://issues.apache.org/jira/browse/DRILL-7668
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.18.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> Drill has a function `time_bucket()` which facilitates time series analysis. 
> This PR extends this function to accept `FLOAT8` and `TIMESTAMPS` as input.  
> Floats are typically not used for timestamps, however in the event that the 
> data is coming from imperfect files, the numbers may be read as floats and 
> hence require casting in queries.  This PR makes this easier. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7668) Allow Time Bucket Function to Accept Floats and Timestamps

2020-04-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17075889#comment-17075889
 ] 

ASF GitHub Bot commented on DRILL-7668:
---

vvysotskyi commented on pull request #2040: DRILL-7668: Allow Time Bucket 
Function to Accept Floats and Timestamps
URL: https://github.com/apache/drill/pull/2040#discussion_r403733469
 
 

 ##
 File path: 
contrib/udfs/src/test/java/org/apache/drill/exec/udfs/TestTimeBucketFunction.java
 ##
 @@ -92,6 +94,28 @@ public void testTimeBucket() throws Exception {
   .go();
   }
 
+  @Test
+  public void testDoubleTimeBucket() throws Exception {
+String query = "SELECT time_bucket(CAST(1451606760 AS DOUBLE), 30) AS 
high FROM (values(1))";
+testBuilder()
+  .sqlQuery(query)
+  .ordered()
+  .baselineColumns("high")
+  .baselineValues(145140L)
+  .go();
+  }
+
+  @Test
+  public void testTimeBucketTimestamp() throws Exception {
+String query = "SELECT time_bucket(CAST(1585272833845 AS TIMESTAMP), 
30) AS high FROM (values(1))";
 
 Review comment:
   Could you please replace it with string representation, it looks more 
friendly and helps to understand the source value:
   ```suggestion
   String query = "SELECT time_bucket(timestamp '2020-03-27 01:33:53.845', 
30) AS high";
   ```
   
   Also, there is no need to specify `FROM (values(1))` if it is not used.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow Time Bucket Function to Accept Floats and Timestamps
> --
>
> Key: DRILL-7668
> URL: https://issues.apache.org/jira/browse/DRILL-7668
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.18.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> Drill has a function `time_bucket()` which facilitates time series analysis. 
> This PR extends this function to accept `FLOAT8` and `TIMESTAMPS` as input.  
> Floats are typically not used for timestamps, however in the event that the 
> data is coming from imperfect files, the numbers may be read as floats and 
> hence require casting in queries.  This PR makes this easier. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7668) Allow Time Bucket Function to Accept Floats and Timestamps

2020-04-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17075873#comment-17075873
 ] 

ASF GitHub Bot commented on DRILL-7668:
---

vvysotskyi commented on issue #2040: DRILL-7668: Allow Time Bucket Function to 
Accept Floats and Timestamps
URL: https://github.com/apache/drill/pull/2040#issuecomment-609444297
 
 
   @cgivre, @arina-ielchiieva, sorry for the delay, I have missed a letter 
about this PR earlier.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow Time Bucket Function to Accept Floats and Timestamps
> --
>
> Key: DRILL-7668
> URL: https://issues.apache.org/jira/browse/DRILL-7668
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.18.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> Drill has a function `time_bucket()` which facilitates time series analysis. 
> This PR extends this function to accept `FLOAT8` and `TIMESTAMPS` as input.  
> Floats are typically not used for timestamps, however in the event that the 
> data is coming from imperfect files, the numbers may be read as floats and 
> hence require casting in queries.  This PR makes this easier. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7668) Allow Time Bucket Function to Accept Floats and Timestamps

2020-04-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17075872#comment-17075872
 ] 

ASF GitHub Bot commented on DRILL-7668:
---

vvysotskyi commented on pull request #2040: DRILL-7668: Allow Time Bucket 
Function to Accept Floats and Timestamps
URL: https://github.com/apache/drill/pull/2040#discussion_r403725422
 
 

 ##
 File path: 
contrib/udfs/src/main/java/org/apache/drill/exec/udfs/TimeBucketFunctions.java
 ##
 @@ -97,9 +99,88 @@ public void eval() {
   long timestamp = inputDate.value;
 
   // Get the interval in milliseconds
-  long intervalToAdd = interval.value;
+  long groupByInterval = interval.value;
 
-  out.value = timestamp - (timestamp % intervalToAdd);
+  out.value = timestamp - (timestamp % groupByInterval);
+}
+  }
+
+  /**
+   * This function is used for facilitating time series analysis by creating 
buckets of time intervals.  See
+   * 
https://blog.timescale.com/blog/simplified-time-series-analytics-using-the-time_bucket-function/
 for usage. The function takes two arguments:
+   * 1. The timestamp (as a Drill timestamp)
+   * 2. The desired bucket interval IN milliseconds
+   *
+   * The function returns a BIGINT of the nearest time bucket.
+   */
+  @FunctionTemplate(name = "time_bucket",
+scope = FunctionTemplate.FunctionScope.SIMPLE,
+nulls = FunctionTemplate.NullHandling.NULL_IF_NULL)
+  public static class TimestampTimeBucketFunction implements DrillSimpleFunc {
+
+@Param
+TimeStampHolder inputDate;
+
+@Param
+BigIntHolder interval;
+
+@Output
+TimeStampHolder out;
+
+@Override
+public void setup() {
+}
+
+@Override
+public void eval() {
+  // Get the timestamp in milliseconds
+  long timestamp = inputDate.value;
+
+  // Get the interval in milliseconds
+  long groupByInterval = interval.value;
+
+  java.time.Instant instant = java.time.Instant.ofEpochMilli(timestamp - 
(timestamp % groupByInterval));
+  java.time.LocalDateTime localDate = 
instant.atZone(java.time.ZoneId.of("UTC")).toLocalDateTime();
+
+  out.value = 
localDate.atZone(java.time.ZoneId.of("UTC")).toInstant().toEpochMilli();
 
 Review comment:
   @cgivre, could you please explain, what happens here? Initially, you 
calculate the required milliseconds, after that creates `Instant` instance 
based on that, converts it to `LocalDateTime` at `UTC` timezone, converts it to 
`ZonedDateTime`, converts it to `Instant` and after that converts back to 
milliseconds.
   
   Are all these transformations required? Usually, UDF shouldn't apply 
timezone to the values they handle.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow Time Bucket Function to Accept Floats and Timestamps
> --
>
> Key: DRILL-7668
> URL: https://issues.apache.org/jira/browse/DRILL-7668
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.18.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> Drill has a function `time_bucket()` which facilitates time series analysis. 
> This PR extends this function to accept `FLOAT8` and `TIMESTAMPS` as input.  
> Floats are typically not used for timestamps, however in the event that the 
> data is coming from imperfect files, the numbers may be read as floats and 
> hence require casting in queries.  This PR makes this easier. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7668) Allow Time Bucket Function to Accept Floats and Timestamps

2020-04-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17075813#comment-17075813
 ] 

ASF GitHub Bot commented on DRILL-7668:
---

arina-ielchiieva commented on issue #2040: DRILL-7668: Allow Time Bucket 
Function to Accept Floats and Timestamps
URL: https://github.com/apache/drill/pull/2040#issuecomment-609407971
 
 
   @vvysotskyi is this PR ready to be merged?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow Time Bucket Function to Accept Floats and Timestamps
> --
>
> Key: DRILL-7668
> URL: https://issues.apache.org/jira/browse/DRILL-7668
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.18.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> Drill has a function `time_bucket()` which facilitates time series analysis. 
> This PR extends this function to accept `FLOAT8` and `TIMESTAMPS` as input.  
> Floats are typically not used for timestamps, however in the event that the 
> data is coming from imperfect files, the numbers may be read as floats and 
> hence require casting in queries.  This PR makes this easier. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7668) Allow Time Bucket Function to Accept Floats and Timestamps

2020-03-31 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071899#comment-17071899
 ] 

ASF GitHub Bot commented on DRILL-7668:
---

cgivre commented on issue #2040: DRILL-7668: Allow Time Bucket Function to 
Accept Floats and Timestamps
URL: https://github.com/apache/drill/pull/2040#issuecomment-606700977
 
 
   @vvysotskyi 
   Thanks for checking this.  I made a small fix and ran the unit tests on my 
local machine with the system set to different timezones, and they passed.  
Please let me know if this solves the issue and I'll squash commits. 
   Thanks
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow Time Bucket Function to Accept Floats and Timestamps
> --
>
> Key: DRILL-7668
> URL: https://issues.apache.org/jira/browse/DRILL-7668
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.18.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> Drill has a function `time_bucket()` which facilitates time series analysis. 
> This PR extends this function to accept `FLOAT8` and `TIMESTAMPS` as input.  
> Floats are typically not used for timestamps, however in the event that the 
> data is coming from imperfect files, the numbers may be read as floats and 
> hence require casting in queries.  This PR makes this easier. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7668) Allow Time Bucket Function to Accept Floats and Timestamps

2020-03-31 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071545#comment-17071545
 ] 

ASF GitHub Bot commented on DRILL-7668:
---

vvysotskyi commented on issue #2040: DRILL-7668: Allow Time Bucket Function to 
Accept Floats and Timestamps
URL: https://github.com/apache/drill/pull/2040#issuecomment-606196720
 
 
   @cgivre, this pull request introduces unit tests failure:
   ```
   [ERROR] 
org.apache.drill.exec.udfs.TestTimeBucketFunction.testTimeBucketTimestamp  Time 
elapsed: 2.584 s  <<< ERROR!
   java.lang.Exception: 
   at position 0 column '`high`' mismatched values, expected: 
2020-03-27T01:30(LocalDateTime) but received 2020-03-26T18:30(LocalDateTime)
   
   Expected Records near verification failure:
   Record Number: 0 { `high` : 2020-03-27T01:30, }
   
   
   Actual Records near verification failure:
   Record Number: 0 { `high` : 2020-03-26T18:30, }
   
   For query: SELECT time_bucket(CAST(1585272833845 AS TIMESTAMP), 30) AS 
high FROM (values(1))
at 
org.apache.drill.exec.udfs.TestTimeBucketFunction.testTimeBucketTimestamp(TestTimeBucketFunction.java:116)
   Caused by: java.lang.Exception: 
   at position 0 column '`high`' mismatched values, expected: 
2020-03-27T01:30(LocalDateTime) but received 2020-03-26T18:30(LocalDateTime)
   
   Expected Records near verification failure:
   Record Number: 0 { `high` : 2020-03-27T01:30, }
   
   
   Actual Records near verification failure:
   Record Number: 0 { `high` : 2020-03-26T18:30, }
   
at 
org.apache.drill.exec.udfs.TestTimeBucketFunction.testTimeBucketTimestamp(TestTimeBucketFunction.java:116)
   Caused by: java.lang.Exception: at position 0 column '`high`' mismatched 
values, expected: 2020-03-27T01:30(LocalDateTime) but received 
2020-03-26T18:30(LocalDateTime)
at 
org.apache.drill.exec.udfs.TestTimeBucketFunction.testTimeBucketTimestamp(TestTimeBucketFunction.java:116)
   ```
   Looks like it may be timezone-dependent failure.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow Time Bucket Function to Accept Floats and Timestamps
> --
>
> Key: DRILL-7668
> URL: https://issues.apache.org/jira/browse/DRILL-7668
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.18.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> Drill has a function `time_bucket()` which facilitates time series analysis. 
> This PR extends this function to accept `FLOAT8` and `TIMESTAMPS` as input.  
> Floats are typically not used for timestamps, however in the event that the 
> data is coming from imperfect files, the numbers may be read as floats and 
> hence require casting in queries.  This PR makes this easier. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7668) Allow Time Bucket Function to Accept Floats and Timestamps

2020-03-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071216#comment-17071216
 ] 

ASF GitHub Bot commented on DRILL-7668:
---

vvysotskyi commented on issue #2040: DRILL-7668: Allow Time Bucket Function to 
Accept Floats and Timestamps
URL: https://github.com/apache/drill/pull/2040#issuecomment-606196720
 
 
   @cgivre, this pull request introduces unit tests failure:
   ```
   [ERROR] 
org.apache.drill.exec.udfs.TestTimeBucketFunction.testTimeBucketTimestamp  Time 
elapsed: 2.584 s  <<< ERROR!
   java.lang.Exception: 
   at position 0 column '`high`' mismatched values, expected: 
2020-03-27T01:30(LocalDateTime) but received 2020-03-26T18:30(LocalDateTime)
   
   Expected Records near verification failure:
   Record Number: 0 { `high` : 2020-03-27T01:30, }
   
   
   Actual Records near verification failure:
   Record Number: 0 { `high` : 2020-03-26T18:30, }
   
   For query: SELECT time_bucket(CAST(1585272833845 AS TIMESTAMP), 30) AS 
high FROM (values(1))
at 
org.apache.drill.exec.udfs.TestTimeBucketFunction.testTimeBucketTimestamp(TestTimeBucketFunction.java:116)
   Caused by: java.lang.Exception: 
   at position 0 column '`high`' mismatched values, expected: 
2020-03-27T01:30(LocalDateTime) but received 2020-03-26T18:30(LocalDateTime)
   
   Expected Records near verification failure:
   Record Number: 0 { `high` : 2020-03-27T01:30, }
   
   
   Actual Records near verification failure:
   Record Number: 0 { `high` : 2020-03-26T18:30, }
   
at 
org.apache.drill.exec.udfs.TestTimeBucketFunction.testTimeBucketTimestamp(TestTimeBucketFunction.java:116)
   Caused by: java.lang.Exception: at position 0 column '`high`' mismatched 
values, expected: 2020-03-27T01:30(LocalDateTime) but received 
2020-03-26T18:30(LocalDateTime)
at 
org.apache.drill.exec.udfs.TestTimeBucketFunction.testTimeBucketTimestamp(TestTimeBucketFunction.java:116)
   ```
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow Time Bucket Function to Accept Floats and Timestamps
> --
>
> Key: DRILL-7668
> URL: https://issues.apache.org/jira/browse/DRILL-7668
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.18.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> Drill has a function `time_bucket()` which facilitates time series analysis. 
> This PR extends this function to accept `FLOAT8` and `TIMESTAMPS` as input.  
> Floats are typically not used for timestamps, however in the event that the 
> data is coming from imperfect files, the numbers may be read as floats and 
> hence require casting in queries.  This PR makes this easier. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7668) Allow Time Bucket Function to Accept Floats and Timestamps

2020-03-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068681#comment-17068681
 ] 

ASF GitHub Bot commented on DRILL-7668:
---

cgivre commented on issue #2040: DRILL-7668: Allow Time Bucket Function to 
Accept Floats and Timestamps
URL: https://github.com/apache/drill/pull/2040#issuecomment-604990458
 
 
   @paul-rogers Thank you very much for the review. 
   
   @arina-ielchiieva  Commits are squashed.  We should be ready to commit. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow Time Bucket Function to Accept Floats and Timestamps
> --
>
> Key: DRILL-7668
> URL: https://issues.apache.org/jira/browse/DRILL-7668
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.18.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Minor
> Fix For: 1.18.0
>
>
> Drill has a function `time_bucket()` which facilitates time series analysis. 
> This PR extends this function to accept `FLOAT8` and `TIMESTAMPS` as input.  
> Floats are typically not used for timestamps, however in the event that the 
> data is coming from imperfect files, the numbers may be read as floats and 
> hence require casting in queries.  This PR makes this easier. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7668) Allow Time Bucket Function to Accept Floats and Timestamps

2020-03-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068653#comment-17068653
 ] 

ASF GitHub Bot commented on DRILL-7668:
---

arina-ielchiieva commented on issue #2040: DRILL-7668: Allow Time Bucket 
Function to Accept Floats and Timestamps
URL: https://github.com/apache/drill/pull/2040#issuecomment-604981406
 
 
   @cgivre please squash the commits.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow Time Bucket Function to Accept Floats and Timestamps
> --
>
> Key: DRILL-7668
> URL: https://issues.apache.org/jira/browse/DRILL-7668
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.18.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Minor
> Fix For: 1.18.0
>
>
> Drill has a function `time_bucket()` which facilitates time series analysis. 
> This PR extends this function to accept `FLOAT8` and `TIMESTAMPS` as input.  
> Floats are typically not used for timestamps, however in the event that the 
> data is coming from imperfect files, the numbers may be read as floats and 
> hence require casting in queries.  This PR makes this easier. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7668) Allow Time Bucket Function to Accept Floats and Timestamps

2020-03-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068192#comment-17068192
 ] 

ASF GitHub Bot commented on DRILL-7668:
---

cgivre commented on pull request #2040: DRILL-7668: Allow Time Bucket Function 
to Accept Floats and Timestamps
URL: https://github.com/apache/drill/pull/2040#discussion_r398987931
 
 

 ##
 File path: 
contrib/udfs/src/main/java/org/apache/drill/exec/udfs/TimeBucketFunctions.java
 ##
 @@ -97,9 +99,85 @@ public void eval() {
   long timestamp = inputDate.value;
 
   // Get the interval in milliseconds
-  long intervalToAdd = interval.value;
+  long groupByInterval = interval.value;
 
-  out.value = timestamp - (timestamp % intervalToAdd);
+  out.value = timestamp - (timestamp % groupByInterval);
+}
+  }
+
+  /**
+   * This function is used for facilitating time series analysis by creating 
buckets of time intervals.  See
+   * 
https://blog.timescale.com/blog/simplified-time-series-analytics-using-the-time_bucket-function/
 for usage. The function takes two arguments:
+   * 1. The timestamp (as a Drill timestamp)
+   * 2. The desired bucket interval IN milliseconds
+   *
+   * The function returns a BIGINT of the nearest time bucket.
+   */
+  @FunctionTemplate(name = "time_bucket",
+scope = FunctionTemplate.FunctionScope.SIMPLE,
+nulls = FunctionTemplate.NullHandling.NULL_IF_NULL)
+  public static class TimestampTimeBucketFunction implements DrillSimpleFunc {
+
+@Param
+TimeStampHolder inputDate;
+
+@Param
+BigIntHolder interval;
+
+@Output
+BigIntHolder out;
 
 Review comment:
   Fixed
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow Time Bucket Function to Accept Floats and Timestamps
> --
>
> Key: DRILL-7668
> URL: https://issues.apache.org/jira/browse/DRILL-7668
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.18.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Minor
> Fix For: 1.18.0
>
>
> Drill has a function `time_bucket()` which facilitates time series analysis. 
> This PR extends this function to accept `FLOAT8` and `TIMESTAMPS` as input.  
> Floats are typically not used for timestamps, however in the event that the 
> data is coming from imperfect files, the numbers may be read as floats and 
> hence require casting in queries.  This PR makes this easier. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7668) Allow Time Bucket Function to Accept Floats and Timestamps

2020-03-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068186#comment-17068186
 ] 

ASF GitHub Bot commented on DRILL-7668:
---

cgivre commented on pull request #2040: DRILL-7668: Allow Time Bucket Function 
to Accept Floats and Timestamps
URL: https://github.com/apache/drill/pull/2040#discussion_r398982654
 
 

 ##
 File path: 
contrib/udfs/src/main/java/org/apache/drill/exec/udfs/TimeBucketFunctions.java
 ##
 @@ -97,9 +99,85 @@ public void eval() {
   long timestamp = inputDate.value;
 
   // Get the interval in milliseconds
-  long intervalToAdd = interval.value;
+  long groupByInterval = interval.value;
 
-  out.value = timestamp - (timestamp % intervalToAdd);
+  out.value = timestamp - (timestamp % groupByInterval);
+}
+  }
+
+  /**
+   * This function is used for facilitating time series analysis by creating 
buckets of time intervals.  See
+   * 
https://blog.timescale.com/blog/simplified-time-series-analytics-using-the-time_bucket-function/
 for usage. The function takes two arguments:
+   * 1. The timestamp (as a Drill timestamp)
+   * 2. The desired bucket interval IN milliseconds
+   *
+   * The function returns a BIGINT of the nearest time bucket.
+   */
+  @FunctionTemplate(name = "time_bucket",
+scope = FunctionTemplate.FunctionScope.SIMPLE,
+nulls = FunctionTemplate.NullHandling.NULL_IF_NULL)
+  public static class TimestampTimeBucketFunction implements DrillSimpleFunc {
+
+@Param
+TimeStampHolder inputDate;
+
+@Param
+BigIntHolder interval;
+
+@Output
+BigIntHolder out;
 
 Review comment:
   I was thinking the same thing actually... of course right after I submitted. 
 I'll convert to TS.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow Time Bucket Function to Accept Floats and Timestamps
> --
>
> Key: DRILL-7668
> URL: https://issues.apache.org/jira/browse/DRILL-7668
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.18.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Minor
> Fix For: 1.18.0
>
>
> Drill has a function `time_bucket()` which facilitates time series analysis. 
> This PR extends this function to accept `FLOAT8` and `TIMESTAMPS` as input.  
> Floats are typically not used for timestamps, however in the event that the 
> data is coming from imperfect files, the numbers may be read as floats and 
> hence require casting in queries.  This PR makes this easier. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7668) Allow Time Bucket Function to Accept Floats and Timestamps

2020-03-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068136#comment-17068136
 ] 

ASF GitHub Bot commented on DRILL-7668:
---

paul-rogers commented on pull request #2040: DRILL-7668: Allow Time Bucket 
Function to Accept Floats and Timestamps
URL: https://github.com/apache/drill/pull/2040#discussion_r398954740
 
 

 ##
 File path: 
contrib/udfs/src/main/java/org/apache/drill/exec/udfs/TimeBucketFunctions.java
 ##
 @@ -97,9 +99,85 @@ public void eval() {
   long timestamp = inputDate.value;
 
   // Get the interval in milliseconds
-  long intervalToAdd = interval.value;
+  long groupByInterval = interval.value;
 
-  out.value = timestamp - (timestamp % intervalToAdd);
+  out.value = timestamp - (timestamp % groupByInterval);
+}
+  }
+
+  /**
+   * This function is used for facilitating time series analysis by creating 
buckets of time intervals.  See
+   * 
https://blog.timescale.com/blog/simplified-time-series-analytics-using-the-time_bucket-function/
 for usage. The function takes two arguments:
+   * 1. The timestamp (as a Drill timestamp)
+   * 2. The desired bucket interval IN milliseconds
+   *
+   * The function returns a BIGINT of the nearest time bucket.
+   */
+  @FunctionTemplate(name = "time_bucket",
+scope = FunctionTemplate.FunctionScope.SIMPLE,
+nulls = FunctionTemplate.NullHandling.NULL_IF_NULL)
+  public static class TimestampTimeBucketFunction implements DrillSimpleFunc {
+
+@Param
+TimeStampHolder inputDate;
+
+@Param
+BigIntHolder interval;
+
+@Output
+BigIntHolder out;
 
 Review comment:
   Sorry; I wonder if the `TimeStamp` version should emit a timestamp. The 
interval is a valid ms-since-the-epoch number; seems to make sense to truncate 
a TS to another TS.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow Time Bucket Function to Accept Floats and Timestamps
> --
>
> Key: DRILL-7668
> URL: https://issues.apache.org/jira/browse/DRILL-7668
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.18.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Minor
> Fix For: 1.18.0
>
>
> Drill has a function `time_bucket()` which facilitates time series analysis. 
> This PR extends this function to accept `FLOAT8` and `TIMESTAMPS` as input.  
> Floats are typically not used for timestamps, however in the event that the 
> data is coming from imperfect files, the numbers may be read as floats and 
> hence require casting in queries.  This PR makes this easier. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7668) Allow Time Bucket Function to Accept Floats and Timestamps

2020-03-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067926#comment-17067926
 ] 

ASF GitHub Bot commented on DRILL-7668:
---

cgivre commented on pull request #2040: DRILL-7668: Allow Time Bucket Function 
to Accept Floats and Timestamps
URL: https://github.com/apache/drill/pull/2040#discussion_r398790952
 
 

 ##
 File path: 
contrib/udfs/src/main/java/org/apache/drill/exec/udfs/TimeBucketFunctions.java
 ##
 @@ -102,4 +104,80 @@ public void eval() {
   out.value = timestamp - (timestamp % intervalToAdd);
 }
   }
+
+  /**
+   * This function is used for facilitating time series analysis by creating 
buckets of time intervals.  See
+   * 
https://blog.timescale.com/blog/simplified-time-series-analytics-using-the-time_bucket-function/
 for usage. The function takes two arguments:
+   * 1. The timestamp (as a Drill timestamp)
+   * 2. The desired bucket interval IN milliseconds
+   *
+   * The function returns a BIGINT of the nearest time bucket.
+   */
+  @FunctionTemplate(name = "time_bucket",
+scope = FunctionTemplate.FunctionScope.SIMPLE,
+nulls = FunctionTemplate.NullHandling.NULL_IF_NULL)
+  public static class TimestampTimeBucketFunction implements DrillSimpleFunc {
+
+@Param
+TimeStampHolder inputDate;
+
+@Param
+BigIntHolder interval;
+
+@Output
+BigIntHolder out;
+
+@Override
+public void setup() {
+}
+
+@Override
+public void eval() {
+  // Get the timestamp in milliseconds
+  long timestamp = inputDate.value;
+
+  // Get the interval in milliseconds
+  long intervalToAdd = interval.value;
+
+  out.value = timestamp - (timestamp % intervalToAdd);
+}
+  }
+
+  /**
+   * This function is used for facilitating time series analysis by creating 
buckets of time intervals.  See
+   * 
https://blog.timescale.com/blog/simplified-time-series-analytics-using-the-time_bucket-function/
 for usage. The function takes two arguments:
+   * 1. The timestamp (as a Drill timestamp)
+   * 2. The desired bucket interval IN milliseconds
+   *
+   * The function returns a BIGINT of the nearest time bucket.
+   */
+  @FunctionTemplate(name = "time_bucket",
+scope = FunctionTemplate.FunctionScope.SIMPLE,
+nulls = FunctionTemplate.NullHandling.NULL_IF_NULL)
+  public static class DoubleTimeBucketFunction implements DrillSimpleFunc {
+
+@Param
+Float8Holder inputDate;
+
+@Param
+BigIntHolder interval;
+
+@Output
+BigIntHolder out;
+
+@Override
+public void setup() {
+}
+
+@Override
+public void eval() {
+  // Get the timestamp in milliseconds
+  long timestamp = (long)inputDate.value;
 
 Review comment:
   Fixed
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow Time Bucket Function to Accept Floats and Timestamps
> --
>
> Key: DRILL-7668
> URL: https://issues.apache.org/jira/browse/DRILL-7668
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.18.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Minor
> Fix For: 1.18.0
>
>
> Drill has a function `time_bucket()` which facilitates time series analysis. 
> This PR extends this function to accept `FLOAT8` and `TIMESTAMPS` as input.  
> Floats are typically not used for timestamps, however in the event that the 
> data is coming from imperfect files, the numbers may be read as floats and 
> hence require casting in queries.  This PR makes this easier. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7668) Allow Time Bucket Function to Accept Floats and Timestamps

2020-03-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067928#comment-17067928
 ] 

ASF GitHub Bot commented on DRILL-7668:
---

cgivre commented on pull request #2040: DRILL-7668: Allow Time Bucket Function 
to Accept Floats and Timestamps
URL: https://github.com/apache/drill/pull/2040#discussion_r398791038
 
 

 ##
 File path: 
contrib/udfs/src/main/java/org/apache/drill/exec/udfs/TimeBucketFunctions.java
 ##
 @@ -102,4 +104,80 @@ public void eval() {
   out.value = timestamp - (timestamp % intervalToAdd);
 }
   }
+
+  /**
+   * This function is used for facilitating time series analysis by creating 
buckets of time intervals.  See
+   * 
https://blog.timescale.com/blog/simplified-time-series-analytics-using-the-time_bucket-function/
 for usage. The function takes two arguments:
+   * 1. The timestamp (as a Drill timestamp)
+   * 2. The desired bucket interval IN milliseconds
+   *
+   * The function returns a BIGINT of the nearest time bucket.
+   */
+  @FunctionTemplate(name = "time_bucket",
+scope = FunctionTemplate.FunctionScope.SIMPLE,
+nulls = FunctionTemplate.NullHandling.NULL_IF_NULL)
+  public static class TimestampTimeBucketFunction implements DrillSimpleFunc {
+
+@Param
+TimeStampHolder inputDate;
+
+@Param
+BigIntHolder interval;
+
+@Output
+BigIntHolder out;
+
+@Override
+public void setup() {
+}
+
+@Override
+public void eval() {
+  // Get the timestamp in milliseconds
+  long timestamp = inputDate.value;
+
+  // Get the interval in milliseconds
+  long intervalToAdd = interval.value;
 
 Review comment:
   Fixed
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow Time Bucket Function to Accept Floats and Timestamps
> --
>
> Key: DRILL-7668
> URL: https://issues.apache.org/jira/browse/DRILL-7668
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.18.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Minor
> Fix For: 1.18.0
>
>
> Drill has a function `time_bucket()` which facilitates time series analysis. 
> This PR extends this function to accept `FLOAT8` and `TIMESTAMPS` as input.  
> Floats are typically not used for timestamps, however in the event that the 
> data is coming from imperfect files, the numbers may be read as floats and 
> hence require casting in queries.  This PR makes this easier. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7668) Allow Time Bucket Function to Accept Floats and Timestamps

2020-03-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067842#comment-17067842
 ] 

ASF GitHub Bot commented on DRILL-7668:
---

paul-rogers commented on pull request #2040: DRILL-7668: Allow Time Bucket 
Function to Accept Floats and Timestamps
URL: https://github.com/apache/drill/pull/2040#discussion_r398731150
 
 

 ##
 File path: 
contrib/udfs/src/main/java/org/apache/drill/exec/udfs/TimeBucketFunctions.java
 ##
 @@ -102,4 +104,80 @@ public void eval() {
   out.value = timestamp - (timestamp % intervalToAdd);
 }
   }
+
+  /**
+   * This function is used for facilitating time series analysis by creating 
buckets of time intervals.  See
+   * 
https://blog.timescale.com/blog/simplified-time-series-analytics-using-the-time_bucket-function/
 for usage. The function takes two arguments:
+   * 1. The timestamp (as a Drill timestamp)
+   * 2. The desired bucket interval IN milliseconds
+   *
+   * The function returns a BIGINT of the nearest time bucket.
+   */
+  @FunctionTemplate(name = "time_bucket",
+scope = FunctionTemplate.FunctionScope.SIMPLE,
+nulls = FunctionTemplate.NullHandling.NULL_IF_NULL)
+  public static class TimestampTimeBucketFunction implements DrillSimpleFunc {
+
+@Param
+TimeStampHolder inputDate;
+
+@Param
+BigIntHolder interval;
+
+@Output
+BigIntHolder out;
+
+@Override
+public void setup() {
+}
+
+@Override
+public void eval() {
+  // Get the timestamp in milliseconds
+  long timestamp = inputDate.value;
+
+  // Get the interval in milliseconds
+  long intervalToAdd = interval.value;
 
 Review comment:
   This is not an interval to "add" so much as a group-by interval.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow Time Bucket Function to Accept Floats and Timestamps
> --
>
> Key: DRILL-7668
> URL: https://issues.apache.org/jira/browse/DRILL-7668
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.18.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Minor
> Fix For: 1.18.0
>
>
> Drill has a function `time_bucket()` which facilitates time series analysis. 
> This PR extends this function to accept `FLOAT8` and `TIMESTAMPS` as input.  
> Floats are typically not used for timestamps, however in the event that the 
> data is coming from imperfect files, the numbers may be read as floats and 
> hence require casting in queries.  This PR makes this easier. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7668) Allow Time Bucket Function to Accept Floats and Timestamps

2020-03-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067841#comment-17067841
 ] 

ASF GitHub Bot commented on DRILL-7668:
---

paul-rogers commented on pull request #2040: DRILL-7668: Allow Time Bucket 
Function to Accept Floats and Timestamps
URL: https://github.com/apache/drill/pull/2040#discussion_r398732279
 
 

 ##
 File path: 
contrib/udfs/src/main/java/org/apache/drill/exec/udfs/TimeBucketFunctions.java
 ##
 @@ -102,4 +104,80 @@ public void eval() {
   out.value = timestamp - (timestamp % intervalToAdd);
 }
   }
+
+  /**
+   * This function is used for facilitating time series analysis by creating 
buckets of time intervals.  See
+   * 
https://blog.timescale.com/blog/simplified-time-series-analytics-using-the-time_bucket-function/
 for usage. The function takes two arguments:
+   * 1. The timestamp (as a Drill timestamp)
+   * 2. The desired bucket interval IN milliseconds
+   *
+   * The function returns a BIGINT of the nearest time bucket.
+   */
+  @FunctionTemplate(name = "time_bucket",
+scope = FunctionTemplate.FunctionScope.SIMPLE,
+nulls = FunctionTemplate.NullHandling.NULL_IF_NULL)
+  public static class TimestampTimeBucketFunction implements DrillSimpleFunc {
+
+@Param
+TimeStampHolder inputDate;
+
+@Param
+BigIntHolder interval;
+
+@Output
+BigIntHolder out;
+
+@Override
+public void setup() {
+}
+
+@Override
+public void eval() {
+  // Get the timestamp in milliseconds
+  long timestamp = inputDate.value;
+
+  // Get the interval in milliseconds
+  long intervalToAdd = interval.value;
+
+  out.value = timestamp - (timestamp % intervalToAdd);
+}
+  }
+
+  /**
+   * This function is used for facilitating time series analysis by creating 
buckets of time intervals.  See
+   * 
https://blog.timescale.com/blog/simplified-time-series-analytics-using-the-time_bucket-function/
 for usage. The function takes two arguments:
+   * 1. The timestamp (as a Drill timestamp)
+   * 2. The desired bucket interval IN milliseconds
+   *
+   * The function returns a BIGINT of the nearest time bucket.
+   */
+  @FunctionTemplate(name = "time_bucket",
+scope = FunctionTemplate.FunctionScope.SIMPLE,
+nulls = FunctionTemplate.NullHandling.NULL_IF_NULL)
+  public static class DoubleTimeBucketFunction implements DrillSimpleFunc {
+
+@Param
+Float8Holder inputDate;
+
+@Param
+BigIntHolder interval;
+
+@Output
+BigIntHolder out;
+
+@Override
+public void setup() {
+}
+
+@Override
+public void eval() {
+  // Get the timestamp in milliseconds
+  long timestamp = (long)inputDate.value;
 
 Review comment:
   `Math.round(inputDate.value)`. We would want 4. to be treated as 
5.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow Time Bucket Function to Accept Floats and Timestamps
> --
>
> Key: DRILL-7668
> URL: https://issues.apache.org/jira/browse/DRILL-7668
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.18.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Minor
> Fix For: 1.18.0
>
>
> Drill has a function `time_bucket()` which facilitates time series analysis. 
> This PR extends this function to accept `FLOAT8` and `TIMESTAMPS` as input.  
> Floats are typically not used for timestamps, however in the event that the 
> data is coming from imperfect files, the numbers may be read as floats and 
> hence require casting in queries.  This PR makes this easier. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7668) Allow Time Bucket Function to Accept Floats and Timestamps

2020-03-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067688#comment-17067688
 ] 

ASF GitHub Bot commented on DRILL-7668:
---

cgivre commented on pull request #2040: DRILL-7668: Allow Time Bucket Function 
to Accept Floats and Timestamps
URL: https://github.com/apache/drill/pull/2040
 
 
   # [DRILL-7668](https://issues.apache.org/jira/browse/DRILL-7668): Allow Time 
Bucket Function to Accept Floats and Timestamps
   
   ## Description
   
   Drill has a function `time_bucket()` which facilitates time series analysis. 
This PR extends this function to accept `FLOAT8` and `TIMESTAMPS` as input.  
Floats are typically not used for timestamps, however in the event that the 
data is coming from imperfect files, the numbers may be read as floats and 
hence require casting in queries.  This PR makes this easier. 
   
   ## Documentation
   `time_bucket()` function now will accept a `timestamp` or `float8` as an 
argument for the time.
   
   ## Testing
   Added two unit tests for the new data types.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow Time Bucket Function to Accept Floats and Timestamps
> --
>
> Key: DRILL-7668
> URL: https://issues.apache.org/jira/browse/DRILL-7668
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.18.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Minor
> Fix For: 1.18.0
>
>
> Drill has a function `time_bucket()` which facilitates time series analysis. 
> This PR extends this function to accept `FLOAT8` and `TIMESTAMPS` as input.  
> Floats are typically not used for timestamps, however in the event that the 
> data is coming from imperfect files, the numbers may be read as floats and 
> hence require casting in queries.  This PR makes this easier. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)