[
https://issues.apache.org/jira/browse/TAJO-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996213#comment-14996213
]
ASF GitHub Bot commented on TAJO-1925:
--------------------------------------
Github user blrunner commented on the pull request:
https://github.com/apache/tajo/pull/853#issuecomment-154998436
I fixed some bugs and added description about timezone with hive. For the
reference, I tested the patch with tpc-h 1g dataset as following:
* on Hive
```
hive> CREATE TABLE timeline(event string) PARTITIONED BY (happened
timestamp);
hive> insert overwrite table timeline partition(happened=1446608033000)
select n_name from nation where n_nationkey = 0;
hive> insert overwrite table timeline partition(happened=1446608037000)
select n_name from nation where n_nationkey = 1;
hive> insert overwrite table timeline partition(happened=854419808037)
select n_name from nation where n_nationkey = 2;
hive> insert overwrite table timeline partition(happened=1446712555983)
select n_name from nation where n_nationkey = 3;
hive> select * from timeline;
OK
BRAZIL 1997-01-28 11:50:08.037
ALGERIA 2015-11-04 12:33:53
ARGENTINA 2015-11-04 12:33:57
CANADA 2015-11-05 17:35:55.983
```
* on Tajo with HiveCatalogStore
```
default> SET TIME ZONE 'Asia/Seoul';
default> select * from timeline;
event, happened
-------------------------------
BRAZIL, 1997-01-28 11:50:08.037
ALGERIA, 2015-11-04 12:33:53
ARGENTINA, 2015-11-04 12:33:57
CANADA, 2015-11-05 17:35:55.983
(4 rows, 0.092 sec, 0 B selected)
default> select * from timeline where happened = TIMESTAMP '2015-11-04
12:33:57';
event, happened
-------------------------------
ARGENTINA, 2015-11-04 12:33:57
(1 rows, 0.126 sec, 0 B selected)
default> insert into timeline select n_name, TIMESTAMP '2015-11-06
14:45:48.066' from nation where n_nationkey = 10;
default> insert into timeline select n_name, TIMESTAMP '2015-11-06
14:45:48.75' from nation where n_nationkey = 11;
default> select * from timeline;
event, happened
-------------------------------
BRAZIL, 1997-01-28 11:50:08.037
ALGERIA, 2015-11-04 12:33:53
ARGENTINA, 2015-11-04 12:33:57
CANADA, 2015-11-05 17:35:55.983
IRAN, 2015-11-06 14:45:48.066
IRAQ, 2015-11-06 14:45:48.75
(6 rows, 0.087 sec, 0 B selected)
```
* on Tajo with MySQLStore
```
default> SET TIME ZONE 'Asia/Seoul';
default> CREATE EXTERNAL TABLE default.timeline (event TEXT) USING TEXT
PARTITION BY COLUMN(happened TIMESTAMP) LOCATION
'hdfs://localhost:9010/user/hive/warehouse/timeline';
default> select * from timeline where happened = TIMESTAMP '1997-01-28
11:50:08.037';
event, happened
-------------------------------
BRAZIL, 1997-01-28 11:50:08.037
(1 rows, 0.074 sec, 0 B selected)
default> select * from timeline where happened = TIMESTAMP '2015-11-04
12:33:57';
event, happened
-------------------------------
ARGENTINA, 2015-11-04 12:33:57
(1 rows, 0.033 sec, 0 B selected)
default> alter table timeline repair partition;
default> insert into timeline select n_name, TIMESTAMP '2015-11-09
16:47:45.012' from nation where n_nationkey = 24;
default> select * from timeline;
event, happened
-------------------------------
BRAZIL, 1997-01-28 11:50:08.037
ALGERIA, 2015-11-04 12:33:53
ARGENTINA, 2015-11-04 12:33:57
CANADA, 2015-11-05 17:35:55.983
IRAN, 2015-11-06 14:45:48.066
IRAQ, 2015-11-06 14:45:48.75
UNITED STATES, 2015-11-09 16:47:45.012
(7 rows, 0.036 sec, 0 B selected)
default> select * from timeline where happened = TIMESTAMP '2015-11-09
16:47:45.012';
event, happened
-------------------------------
UNITED STATES, 2015-11-09 16:47:45.012
```
> Improve hive compatibility with TIMESTAMP partition column.
> -----------------------------------------------------------
>
> Key: TAJO-1925
> URL: https://issues.apache.org/jira/browse/TAJO-1925
> Project: Tajo
> Issue Type: Improvement
> Components: Catalog, Physical Operator
> Affects Versions: 0.11.0, 0.12.0
> Reporter: Jaehwa Jung
> Assignee: Jaehwa Jung
> Fix For: 0.12.0, 0.11.1
>
> Attachments: TAJO-1925.patch
>
>
> -Currently, Tajo allow to use TIMESTAMP column as a partition key. But if
> users use TIMESTAMP partition column, Tajo doesn't keep the original
> TIMESTAMP values. Actually, Tajo automatically converts TIMESTAMP values to
> STRING literals which are accepted in the format YYYY-MM-DD HH:MM:SS. As a
> result, Tajo can't keep mills of second and can't provide right hive
> compatibility-
> We need to support hive compatibility for all partition column types. But
> when using TIMESTAMP partition column, users might not get correct partitions
> occasionally because tajo partition name is different from hive partition
> name. Actually, Tajo automatically converts TIMESTAMP values to STRING
> literals which are accepted in the format YYYY-MM-DD HH:MM:SS. But Hive
> automatically converts it to STRING literals which are accepted in the format
> YYYY-MM-DD HH:MM:SS.MS.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)