[ 
https://issues.apache.org/jira/browse/TAJO-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13851603#comment-13851603
 ] 

Hyunsik Choi commented on TAJO-283:
-----------------------------------

The result in the staging dir is finally moved to the specified output 
directory. Usually, that is the warehouse dir (e.g., /tajo/warehouse/xxxx).

In TAJO-329, Jaehwa implemented a table partition executor for 
column-partitioned tables. Interestingly, TAJO-329 works correctly without any 
shuffle. However, this approach creates too many output files, as many as the 
number of HDFS blocks, which does not suit HDFS's characteristics. 
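
To give a feel for the file count, here is a rough sketch. It assumes that the 
blocks in question are the input blocks and that each block is handled by one 
task that writes one output file. The class and method names are made up for 
illustration and are not part of Tajo.

```java
// Hypothetical helper, not Tajo code: estimates how many output files a
// shuffle-free partitioned store would write, ASSUMING one file per HDFS block.
final class OutputFileEstimate {
    private OutputFileEstimate() {}

    // Number of blocks needed to hold inputBytes, rounded up.
    static long blocks(long inputBytes, long blockBytes) {
        if (inputBytes < 0 || blockBytes <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative / positive");
        }
        return (inputBytes + blockBytes - 1) / blockBytes;
    }
}
```

Under these assumptions, a 1 TiB input with 128 MiB blocks gives 
blocks(1L << 40, 128L << 20) = 8192 output files.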

So, I'm going to modify the distributed planner to allow a partitioned table 
store operator to have a proper shuffle method. For example, hash shuffle is 
good for column, list, and hash partition types, and range shuffle is good for 
range partition. In some special cases, table partitioning without shuffle may 
be useful after TAJO-385, which merges a number of fragments into fewer fragments.
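
The mapping from partition type to shuffle method described above could be 
sketched roughly as follows. This is only an illustration of the idea: the enum 
and class names are hypothetical and do not correspond to the actual planner 
classes.

```java
// Hypothetical names for illustration only; not Tajo's real planner types.
enum PartitionType { COLUMN, LIST, HASH, RANGE }

enum ShuffleType { HASH_SHUFFLE, RANGE_SHUFFLE }

final class ShuffleChooser {
    private ShuffleChooser() {}

    // Picks a shuffle method for the store operator of a partitioned table:
    // hash shuffle for column, list, and hash partitions; range shuffle for
    // range partitions.
    static ShuffleType choose(PartitionType type) {
        switch (type) {
            case COLUMN:
            case LIST:
            case HASH:
                return ShuffleType.HASH_SHUFFLE;
            case RANGE:
                return ShuffleType.RANGE_SHUFFLE;
            default:
                throw new IllegalArgumentException("unknown partition type: " + type);
        }
    }
}
```

The shuffle-free special case mentioned above is left out of this sketch, 
since when it applies would depend on how TAJO-385 turns out.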

Thanks!

> Add Table Partitioning
> ----------------------
>
>                 Key: TAJO-283
>                 URL: https://issues.apache.org/jira/browse/TAJO-283
>             Project: Tajo
>          Issue Type: New Feature
>          Components: catalog, physical operator, planner/optimizer
>            Reporter: Hyunsik Choi
>            Assignee: Hyunsik Choi
>             Fix For: 0.8-incubating
>
>
> Table partitioning provides many facilities for maintaining large tables. 
> First of all, it enables the data management system to prune input data that 
> is not actually needed. In addition, it gives the system more optimization 
> opportunities that exploit the physical layout.
> Basically, Tajo should follow the RDBMS-style partitioning system, including 
> range, list, hash, and so on. In order to keep Hive compatibility, we need to 
> add a Hive partition type that does not exist in existing DBMS systems.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
