[ 
https://issues.apache.org/jira/browse/DRILL-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15755268#comment-15755268
 ] 

Rahul Challapalli commented on DRILL-5002:
------------------------------------------

From the queries you ran, all the timestamp functions you used are overridden 
by Drill, so I expect them to work. I therefore tried 'months_between', which I 
couldn't find in the Drill repo. Even with that function, I got the right 
results.
{code}
select months_between(timestamp_col, '1996-02-29 09:32:01.0') from 
dfs.`/drill/testdata/cross-sources/fewtypes.parquet`;
+-----------------+
|     EXPR$0      |
+-----------------+
| 10.10958931     |
| 10.10546558     |
| 66.7765976      |
| 11.38709677     |
| 11.3870964      |
| 11.41935484     |
| 11.4516129      |
| 11.48387097     |
| 11.51612903     |
| 11.5483871      |
| 11.58064516     |
| 227.64516129    |
| 179.51612903    |
| -1032.61290323  |
| -468.41935484   |
| 35.58064516     |
| -1188.41935484  |
| 11.58064516     |
| 1211.58064516   |
| -0.03225806     |
| 0.0             |
+-----------------+
21 rows selected (0.401 seconds)
{code}

Also, what is the reason for running 'alter session set 
store.parquet.reader.int96_as_timestamp = true;' before your queries? 
If it is required for specific functions, we should document which ones.

Apart from the above comment, I think we can close this JIRA. Thanks for 
looking into it, [~vitalii].

> Using hive's date functions on top of date column in parquet gives wrong 
> results
> --------------------------------------------------------------------------------
>
>                 Key: DRILL-5002
>                 URL: https://issues.apache.org/jira/browse/DRILL-5002
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Hive, Storage - Parquet
>            Reporter: Rahul Challapalli
>            Assignee: Vitalii Diravka
>            Priority: Critical
>
> git.commit.id.abbrev=190d5d4
> Wrong Result 1 :
> {code}
> select l_shipdate, `month`(l_shipdate) from cp.`tpch/lineitem.parquet` where 
> l_shipdate = date '1994-02-01' limit 2;
> +-------------+---------+
> | l_shipdate  | EXPR$1  |
> +-------------+---------+
> | 1994-02-01  | 1       |
> | 1994-02-01  | 1       |
> +-------------+---------+
> {code}
> Wrong Result 2 : 
> {code}
> select l_shipdate, `day`(l_shipdate) from cp.`tpch/lineitem.parquet` where 
> l_shipdate = date '1998-06-02' limit 2;
> +-------------+---------+
> | l_shipdate  | EXPR$1  |
> +-------------+---------+
> | 1998-06-02  | 1       |
> | 1998-06-02  | 1       |
> +-------------+---------+
> {code}
> Correct Result :
> {code}
> select l_shipdate, `month`(l_shipdate) from cp.`tpch/lineitem.parquet` where 
> l_shipdate = date '1998-06-02' limit 2;
> +-------------+---------+
> | l_shipdate  | EXPR$1  |
> +-------------+---------+
> | 1998-06-02  | 6       |
> | 1998-06-02  | 6       |
> +-------------+---------+
> {code}
> It looks like we get wrong results when the 'day' is '01'. I only tried the 
> month and day Hive functions, but I wouldn't be surprised if the other Hive 
> date functions have similar issues too.
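> One way to see which dates are affected might be to compare the Hive 
> functions against Drill's own extract() on the same column. This is only a 
> sketch: it assumes extract() is evaluated by Drill rather than Hive, and the 
> column aliases are made up for illustration.
> {code}
> -- list rows where the Hive function and Drill's extract() disagree
> select l_shipdate,
>        `month`(l_shipdate) as hive_month,
>        extract(month from l_shipdate) as drill_month,
>        `day`(l_shipdate) as hive_day,
>        extract(day from l_shipdate) as drill_day
> from cp.`tpch/lineitem.parquet`
> where `month`(l_shipdate) <> extract(month from l_shipdate)
>    or `day`(l_shipdate) <> extract(day from l_shipdate)
> limit 20;
> {code}
> If the problem is present, the rows returned should show which dates are 
> affected; an empty result would suggest the two implementations agree on 
> this file.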



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
