[ 
https://issues.apache.org/jira/browse/TAJO-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13851968#comment-13851968
 ] 

Min Zhou commented on TAJO-283:
-------------------------------

Great! Thanks for the information.

I was thinking about the small HDFS files issue that arises if we don't do a 
merge through shuffle. The number of files would be M * R, where M is the 
number of mapper tasks and R is the number of reducer tasks. If data shuffling 
is added, the number of files would drop to R.
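
To make the arithmetic concrete, here is a rough sketch in Python. It is not 
Tajo code: the function names and the modulo partitioning are hypothetical, and 
it assumes each writer task creates one file per partition it has rows for.

```python
# Illustrative sketch only; names and partitioning scheme are assumptions,
# not Tajo's implementation.

def partition_of(key: int, num_partitions: int) -> int:
    """Hypothetical hash partitioning: assign an integer key to a partition."""
    return key % num_partitions


def files_without_shuffle(mapper_rows, num_partitions: int) -> int:
    """Each mapper writes its own file for every partition it touches,
    so the count is bounded by M * R."""
    files = set()
    for mapper_id, keys in enumerate(mapper_rows):
        for key in keys:
            files.add((mapper_id, partition_of(key, num_partitions)))
    return len(files)


def files_with_shuffle(mapper_rows, num_partitions: int) -> int:
    """Rows are first shuffled to one reducer per partition, and each
    reducer writes a single file, so the count is bounded by R."""
    files = set()
    for keys in mapper_rows:
        for key in keys:
            files.add(partition_of(key, num_partitions))
    return len(files)


# M = 3 mappers, R = 4 partitions/reducers; every mapper sees every partition.
mapper_rows = [range(0, 8), range(8, 16), range(16, 24)]
no_shuffle_count = files_without_shuffle(mapper_rows, 4)
shuffle_count = files_with_shuffle(mapper_rows, 4)
print(no_shuffle_count, shuffle_count)
```

With realistic task counts the gap grows quickly: under the same assumption, 
1000 mappers and 100 reducers could produce up to 100,000 files without the 
shuffle, versus 100 with it.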


> Add Table Partitioning
> ----------------------
>
>                 Key: TAJO-283
>                 URL: https://issues.apache.org/jira/browse/TAJO-283
>             Project: Tajo
>          Issue Type: New Feature
>          Components: catalog, physical operator, planner/optimizer
>            Reporter: Hyunsik Choi
>            Assignee: Hyunsik Choi
>             Fix For: 0.8-incubating
>
>
> Table partitioning provides many facilities for maintaining large tables. 
> First of all, it enables the data management system to prune input data that 
> is not actually needed. In addition, it gives the system more optimization 
> opportunities that exploit the physical layout.
> Basically, Tajo should follow the RDBMS-style partitioning system, including 
> range, list, hash, and so on. In order to keep Hive compatibility, we need to 
> add a Hive partition type that does not exist in existing DBMS systems.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
