Krystal created DRILL-5034:
------------------------------
Summary: Select timestamp from hive generated parquet always
return in UTC
Key: DRILL-5034
URL: https://issues.apache.org/jira/browse/DRILL-5034
Project: Apache Drill
Issue Type: Bug
Components: Storage - Parquet
Affects Versions: 1.9.0
Reporter: Krystal
commit id: 5cea9afa6278e21574c6a982ae5c3d82085ef904
Reading timestamp data against a hive parquet table from drill automatically
converts the timestamp data to UTC.
SELECT TIMEOFDAY() FROM (VALUES(1));
+----------------------------------------------+
| EXPR$0 |
+----------------------------------------------+
| 2016-11-10 12:33:26.547 America/Los_Angeles |
+----------------------------------------------+
data schema:
message hive_schema {
optional int32 voter_id;
optional binary name (UTF8);
optional int32 age;
optional binary registration (UTF8);
optional fixed_len_byte_array(3) contributions (DECIMAL(6,2));
optional int32 voterzone;
optional int96 create_timestamp;
optional int32 create_date (DATE);
}
Using drill-1.8, the returned timestamps match the table data:
select convert_from(create_timestamp, 'TIMESTAMP_IMPALA') from
`/user/hive/warehouse/voter_hive_parquet` limit 5;
+------------------------+
| EXPR$0 |
+------------------------+
| 2016-10-23 20:03:58.0 |
| null |
| 2016-09-09 12:01:18.0 |
| 2017-03-06 20:35:55.0 |
| 2017-01-20 22:32:43.0 |
+------------------------+
5 rows selected (1.032 seconds)
If the user timzone is changed to UTC, then the timestamp data is returned in
UTC time.
Using drill-1.9, the returned timestamps got converted to UTC eventhough the
user timezone is in PST.
select convert_from(create_timestamp, 'TIMESTAMP_IMPALA') from
dfs.`/user/hive/warehouse/voter_hive_parquet` limit 5;
+------------------------+
| EXPR$0 |
+------------------------+
| 2016-10-24 03:03:58.0 |
| null |
| 2016-09-09 19:01:18.0 |
| 2017-03-07 04:35:55.0 |
| 2017-01-21 06:32:43.0 |
+------------------------+
alter session set `store.parquet.reader.int96_as_timestamp`=true;
+-------+---------------------------------------------------+
| ok | summary |
+-------+---------------------------------------------------+
| true | store.parquet.reader.int96_as_timestamp updated. |
+-------+---------------------------------------------------+
select create_timestamp from dfs.`/user/hive/warehouse/voter_hive_parquet`
limit 5;
+------------------------+
| create_timestamp |
+------------------------+
| 2016-10-24 03:03:58.0 |
| null |
| 2016-09-09 19:01:18.0 |
| 2017-03-07 04:35:55.0 |
| 2017-01-21 06:32:43.0 |
+------------------------+
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)