[jira] [Updated] (DRILL-4345) Hive Native Reader reporting wrong results for timestamp column in hive generated parquet file

Rahul Challapalli (JIRA) Tue, 02 Feb 2016 16:18:58 -0800

     [ 
https://issues.apache.org/jira/browse/DRILL-4345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Rahul Challapalli updated DRILL-4345:
-------------------------------------
    Attachment: hive1_fewtypes_null.parquet

> Hive Native Reader reporting wrong results for timestamp column in hive 
> generated parquet file
> ----------------------------------------------------------------------------------------------
>
>                 Key: DRILL-4345
>                 URL: https://issues.apache.org/jira/browse/DRILL-4345
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Hive, Storage - Parquet
>            Reporter: Rahul Challapalli
>            Priority: Critical
>         Attachments: hive1_fewtypes_null.parquet
>
>
> git.commit.id.abbrev=1b96174
> The Hive plugin and the native Parquet reader return different timestamp_col 
> values for the same table, as the two queries below show.
> {code}
> 0: jdbc:drill:zk=10.10.100.190:5181> use hive;
> +-------+-----------------------------------+
> |  ok   |              summary              |
> +-------+-----------------------------------+
> | true  | Default schema changed to [hive]  |
> +-------+-----------------------------------+
> 1 row selected (0.415 seconds)
> 0: jdbc:drill:zk=10.10.100.190:5181> select int_col, timestamp_col from hive1_fewtypes_null_parquet;
> +----------+------------------------+
> | int_col  |     timestamp_col      |
> +----------+------------------------+
> | 1        | null                   |
> | null     | 1997-01-02 00:00:00.0  |
> | 3        | null                   |
> | 4        | null                   |
> | 5        | 1997-02-10 17:32:00.0  |
> | 6        | 1997-02-11 17:32:01.0  |
> | 7        | 1997-02-12 17:32:01.0  |
> | 8        | 1997-02-13 17:32:01.0  |
> | 9        | null                   |
> | 10       | 1997-02-15 17:32:01.0  |
> | null     | 1997-02-16 17:32:01.0  |
> | 12       | 1897-02-18 17:32:01.0  |
> | 13       | 2002-02-14 17:32:01.0  |
> | 14       | 1991-02-10 17:32:01.0  |
> | 15       | 1900-02-16 17:32:01.0  |
> | 16       | null                   |
> | null     | 1897-02-16 17:32:01.0  |
> | 18       | 1997-02-16 17:32:01.0  |
> | null     | null                   |
> | 20       | 1996-02-28 17:32:01.0  |
> | null     | null                   |
> +----------+------------------------+
> 21 rows selected (0.368 seconds)
> 0: jdbc:drill:zk=10.10.100.190:5181> alter session set `store.hive.optimize_scan_with_native_readers` = true;
> +-------+--------------------------------------------------------+
> |  ok   |                        summary                         |
> +-------+--------------------------------------------------------+
> | true  | store.hive.optimize_scan_with_native_readers updated.  |
> +-------+--------------------------------------------------------+
> 1 row selected (0.213 seconds)
> 0: jdbc:drill:zk=10.10.100.190:5181> select int_col, timestamp_col from hive1_fewtypes_null_parquet;
> +----------+------------------------+
> | int_col  |     timestamp_col      |
> +----------+------------------------+
> | 1        | null                   |
> | null     | 1997-01-02 00:00:00.0  |
> | 3        | 1997-02-10 17:32:00.0  |
> | 4        | null                   |
> | 5        | 1997-02-11 17:32:01.0  |
> | 6        | 1997-02-12 17:32:01.0  |
> | 7        | 1997-02-13 17:32:01.0  |
> | 8        | 1997-02-15 17:32:01.0  |
> | 9        | 1997-02-16 17:32:01.0  |
> | 10       | 1900-02-16 17:32:01.0  |
> | null     | 1897-02-16 17:32:01.0  |
> | 12       | 1997-02-16 17:32:01.0  |
> | 13       | 1996-02-28 17:32:01.0  |
> | 14       | 1997-01-02 00:00:00.0  |
> | 15       | 1997-01-02 00:00:00.0  |
> | 16       | 1997-01-02 00:00:00.0  |
> | null     | 1997-01-02 00:00:00.0  |
> | 18       | 1997-01-02 00:00:00.0  |
> | null     | 1997-01-02 00:00:00.0  |
> | 20       | 1997-01-02 00:00:00.0  |
> | null     | 1997-01-02 00:00:00.0  |
> +----------+------------------------+
> 21 rows selected (0.352 seconds)
> {code}
> DDL for the Hive table:
> {code}
> create external table hive1_fewtypes_null_parquet (
>       int_col int,
>       bigint_col bigint,
>       date_col string,
>       time_col string,
>       timestamp_col timestamp,
>       interval_col string,
>       varchar_col string,
>       float_col float,
>       double_col double,
>       bool_col boolean
>     )
> stored as parquet
> location '/drill/testdata/hive_storage/hive1_fewtypes_null';
> {code}
> The underlying Parquet file is attached.
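
To make the discrepancy easier to see, here is a minimal sketch that compares the timestamp_col values from the two result sets in the report, row by row. The values are transcribed from the query output above. The names `HIVE_PLUGIN`, `NATIVE_READER` and `mismatched_rows` are illustrative only and not part of Drill, and the comparison assumes both queries return rows in the same order (the int_col values line up in both outputs, which suggests they do).

```python
# Timestamp values transcribed from the two result sets in this report,
# in output order (row 1 to row 21). None stands for SQL null.
HIVE_PLUGIN = [
    None, "1997-01-02 00:00:00.0", None, None,
    "1997-02-10 17:32:00.0", "1997-02-11 17:32:01.0",
    "1997-02-12 17:32:01.0", "1997-02-13 17:32:01.0", None,
    "1997-02-15 17:32:01.0", "1997-02-16 17:32:01.0",
    "1897-02-18 17:32:01.0", "2002-02-14 17:32:01.0",
    "1991-02-10 17:32:01.0", "1900-02-16 17:32:01.0", None,
    "1897-02-16 17:32:01.0", "1997-02-16 17:32:01.0", None,
    "1996-02-28 17:32:01.0", None,
]

NATIVE_READER = [
    None, "1997-01-02 00:00:00.0", "1997-02-10 17:32:00.0", None,
    "1997-02-11 17:32:01.0", "1997-02-12 17:32:01.0",
    "1997-02-13 17:32:01.0", "1997-02-15 17:32:01.0",
    "1997-02-16 17:32:01.0", "1900-02-16 17:32:01.0",
    "1897-02-16 17:32:01.0", "1997-02-16 17:32:01.0",
    "1996-02-28 17:32:01.0",
] + ["1997-01-02 00:00:00.0"] * 8  # rows 14 to 21 all show the same value


def mismatched_rows(expected, actual):
    """Return the 1-based row positions where the two columns disagree."""
    return [
        row
        for row, (e, a) in enumerate(zip(expected, actual), start=1)
        if e != a
    ]


mismatches = mismatched_rows(HIVE_PLUGIN, NATIVE_READER)
print(f"{len(mismatches)} of {len(HIVE_PLUGIN)} rows differ: {mismatches}")
print("null count:", HIVE_PLUGIN.count(None), "(Hive plugin) vs",
      NATIVE_READER.count(None), "(native reader)")
```

Rows are identified by their position in the output rather than by int_col, because int_col is itself null in several rows.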



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
