handmadecode opened a new issue, #2788:
URL: https://github.com/apache/drill/issues/2788

   **Describe the bug**
   We are using Drill with parquet files where the timestamp columns are in 
microseconds. When those columns are displayed, Drill converts the microsecond 
values to milliseconds. However, when using a timestamp column in WHERE clauses 
it looks like the original microsecond value is used instead of the adjusted 
millisecond value when filtering records.
   
   **To Reproduce**
   Assume a parquet file a directory "Test" with a column _timestampCol_ having 
the type `org.apache.parquet.schema.OriginalType.TIMESTAMP_MICROS`.
   
   Assume there are two records with the values 1673981999806149 and 
1674759597743552, respectively, in that column (i.e. the UTC dates 
2023-01-17T18:59:59.806149 and 2023-01-26T18:59:57.743552)
   
   1. Execute the query
   `SELECT timestampCol FROM dfs.Test;`
   The result includes both records, as expected.
   
   2. Execute the query
   `SELECT timestampCol FROM dfs.Test WHERE timestampCol < 
TO_TIMESTAMP('2023-02-01 00:00:00', 'yyyy-MM-dd HH:mm:ss')`
   This produces an empty result although both records have a value less than 
the argument.
   
   3. Execute
   `SELECT timestampCol FROM dfs.Test WHERE timestampCol > 
TO_TIMESTAMP('2023-02-01 00:00:00', 'yyyy-MM-dd HH:mm:ss')`
   The result includes both records although neither have a value greater than 
the argument.
   
   **Expected behavior**
   The query in 2) above should produce a result with both records, and the 
query in 3) should produce an empty result. 
   
   **Error detail, log output or screenshots**
   No errors in the logs
   
   **Drill version**
   1.21.0 and master (as of commit 9c401c6).
   
   **Additional context**
   Even timestamps long into the future produce results with both records, e.g.:
   `SELECT timestampCol FROM dfs.Test WHERE timestampCol > 
TO_TIMESTAMP('2502-04-04 00:00:00', 'yyyy-MM-dd HH:mm:ss')`
   
   Manually converting the timestamp column to milliseconds produces the 
expected result:
   `SELECT timestampCol FROM dfs.Test WHERE 
TO_TIMESTAMP(CONVERT_FROM(CONVERT_TO(timestampCol, 'TIMESTAMP_EPOCH'), 
'BIGINT')/1000) < TO_TIMESTAMP('2023-02-01 00:00:00', 'yyyy-MM-dd HH:mm:ss')`
   produces a result with both records.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to