[
https://issues.apache.org/jira/browse/TAJO-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13851968#comment-13851968
]
Min Zhou commented on TAJO-283:
-------------------------------
Great! Thanks for the information.
I was thinking about the small HDFS files issue that arises if we don't merge
through a shuffle. The number of files would be M * R, where M is the number of
mapper tasks and R is the number of reducer tasks. If data shuffling is added,
the number of files would drop to R.
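
To make the arithmetic concrete, here is a rough sketch of the estimate. The
function name and the sample task counts are made up for illustration, and it
assumes the worst case, where every mapper holds rows for all R outputs and
each writer produces exactly one file per output:

```python
def output_file_count(num_mappers: int, num_reducers: int, shuffle: bool) -> int:
    """Estimate how many HDFS files a partitioned write produces.

    Worst-case assumption: every mapper holds rows for all R outputs,
    and each writer emits exactly one file per output.
    """
    if shuffle:
        # Rows are redistributed first, so each reducer writes a single file.
        return num_reducers
    # Without a shuffle, every mapper writes its own file for each output.
    return num_mappers * num_reducers


# Hypothetical task counts, chosen only for illustration.
m, r = 200, 50
print("without shuffle:", output_file_count(m, r, shuffle=False))
print("with shuffle:   ", output_file_count(m, r, shuffle=True))
```

With these sample numbers the count falls from 10000 files to 50, which is why
the shuffle matters for the NameNode even though it costs an extra data
transfer.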
> Add Table Partitioning
> ----------------------
>
> Key: TAJO-283
> URL: https://issues.apache.org/jira/browse/TAJO-283
> Project: Tajo
> Issue Type: New Feature
> Components: catalog, physical operator, planner/optimizer
> Reporter: Hyunsik Choi
> Assignee: Hyunsik Choi
> Fix For: 0.8-incubating
>
>
> Table partitioning provides many facilities for maintaining large tables. First of
> all, it enables the data management system to prune input data that is
> not actually needed. In addition, it gives the system more optimization
> opportunities that exploit the physical layout.
> Basically, Tajo should follow RDBMS-style partitioning, including
> range, list, hash, and so on. In order to keep Hive compatibility, we need to
> add a Hive partition type that does not exist in existing DBMSs.