[ 
https://issues.apache.org/jira/browse/TAJO-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996213#comment-14996213
 ] 

ASF GitHub Bot commented on TAJO-1925:
--------------------------------------

Github user blrunner commented on the pull request:

    https://github.com/apache/tajo/pull/853#issuecomment-154998436
  
    I fixed some bugs and added description about timezone with hive. For the 
reference, I tested the patch with tpc-h 1g dataset as following:
    
    * on Hive
    ```
    hive> CREATE TABLE timeline(event string) PARTITIONED BY (happened 
timestamp);
    
    hive> insert overwrite table timeline partition(happened=1446608033000) 
select n_name from nation where n_nationkey = 0;
    
    hive> insert overwrite table timeline partition(happened=1446608037000) 
select n_name from nation where n_nationkey = 1;
    
    hive> insert overwrite table timeline partition(happened=854419808037) 
select n_name from nation where n_nationkey = 2;
    
    hive> insert overwrite table timeline partition(happened=1446712555983) 
select n_name from nation where n_nationkey = 3;
    
    hive> select * from timeline;
    OK
    BRAZIL      1997-01-28 11:50:08.037
    ALGERIA     2015-11-04 12:33:53
    ARGENTINA   2015-11-04 12:33:57
    CANADA      2015-11-05 17:35:55.983
    ```
    
    * on Tajo with HiveCatalogStore
    ```
    default> SET TIME ZONE 'Asia/Seoul';
    
    default> select * from timeline;
    event,  happened
    -------------------------------
    BRAZIL,  1997-01-28 11:50:08.037
    ALGERIA,  2015-11-04 12:33:53
    ARGENTINA,  2015-11-04 12:33:57
    CANADA,  2015-11-05 17:35:55.983
    (4 rows, 0.092 sec, 0 B selected)
    
    default> select * from timeline where happened = TIMESTAMP '2015-11-04 
12:33:57';
    event,  happened
    -------------------------------
    ARGENTINA,  2015-11-04 12:33:57
    (1 rows, 0.126 sec, 0 B selected)
    
    default> insert into timeline select n_name, TIMESTAMP '2015-11-06 
14:45:48.066' from nation where n_nationkey = 10;
    default> insert into timeline select n_name, TIMESTAMP '2015-11-06 
14:45:48.75' from nation where n_nationkey = 11;
    
    default> select * from timeline;
    event,  happened
    -------------------------------
    BRAZIL,  1997-01-28 11:50:08.037
    ALGERIA,  2015-11-04 12:33:53
    ARGENTINA,  2015-11-04 12:33:57
    CANADA,  2015-11-05 17:35:55.983
    IRAN,  2015-11-06 14:45:48.066
    IRAQ,  2015-11-06 14:45:48.75
    (6 rows, 0.087 sec, 0 B selected)
    ```
    
    * on Tajo with MySQLStore
    ```
    default> SET TIME ZONE 'Asia/Seoul';
    
    default> CREATE EXTERNAL TABLE default.timeline (event TEXT) USING TEXT
    PARTITION BY COLUMN(happened TIMESTAMP) LOCATION 
'hdfs://localhost:9010/user/hive/warehouse/timeline';
    
    default> select * from timeline where happened = TIMESTAMP '1997-01-28 
11:50:08.037';
    event,  happened
    -------------------------------
    BRAZIL,  1997-01-28 11:50:08.037
    (1 rows, 0.074 sec, 0 B selected)
    
    default> select * from timeline where happened = TIMESTAMP '2015-11-04 
12:33:57';
    event,  happened
    -------------------------------
    ARGENTINA,  2015-11-04 12:33:57
    (1 rows, 0.033 sec, 0 B selected)
    
    default> alter table timeline repair partition;
    
    default> insert into timeline select n_name, TIMESTAMP '2015-11-09 
16:47:45.012' from nation where n_nationkey = 24;
    
    default> select * from timeline;
    event,  happened
    -------------------------------
    BRAZIL,  1997-01-28 11:50:08.037
    ALGERIA,  2015-11-04 12:33:53
    ARGENTINA,  2015-11-04 12:33:57
    CANADA,  2015-11-05 17:35:55.983
    IRAN,  2015-11-06 14:45:48.066
    IRAQ,  2015-11-06 14:45:48.75
    UNITED STATES,  2015-11-09 16:47:45.012
    (7 rows, 0.036 sec, 0 B selected)
    
    default> select * from timeline where happened = TIMESTAMP '2015-11-09 
16:47:45.012';
    event,  happened
    -------------------------------
    UNITED STATES,  2015-11-09 16:47:45.012
    ```



> Improve hive compatibility with TIMESTAMP partition column.
> -----------------------------------------------------------
>
>                 Key: TAJO-1925
>                 URL: https://issues.apache.org/jira/browse/TAJO-1925
>             Project: Tajo
>          Issue Type: Improvement
>          Components: Catalog, Physical Operator
>    Affects Versions: 0.11.0, 0.12.0
>            Reporter: Jaehwa Jung
>            Assignee: Jaehwa Jung
>             Fix For: 0.12.0, 0.11.1
>
>         Attachments: TAJO-1925.patch
>
>
> -Currently, Tajo allow to use TIMESTAMP column as a partition key. But if 
> users use TIMESTAMP partition column, Tajo doesn't keep the original 
> TIMESTAMP values. Actually, Tajo automatically converts TIMESTAMP values to 
> STRING literals which are accepted in the format YYYY-MM-DD HH:MM:SS. As a 
> result, Tajo can't keep mills of second and can't provide right hive 
> compatibility-
> We need to support hive compatibility for all partition column types. But 
> when using TIMESTAMP partition column, users might not get correct partitions 
> occasionally because tajo partition name is different from hive partition 
> name. Actually, Tajo automatically converts TIMESTAMP values to STRING 
> literals which are accepted in the format YYYY-MM-DD HH:MM:SS. But Hive 
> automatically converts it to STRING literals which are accepted in the format 
> YYYY-MM-DD HH:MM:SS.MS. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to