[ https://issues.apache.org/jira/browse/TAJO-744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyunsik Choi updated TAJO-744:
------------------------------
Description:
Currently, Tajo does not manage partitions directly. In Tajo, each partition
is just a directory. For each query, the logical planner traverses the matching
directories in HDFS according to the partition predicates.
This approach is inefficient, especially in environments where the number
of partitions is very large. It also makes partition management hard.
Tajo should manage partitions directly through ALTER TABLE ADD/DROP PARTITION
statements. The partition entries should be stored in the underlying
database that the catalog uses.
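To make the cost of the directory-based approach concrete, here is a minimal Python sketch of predicate-driven directory traversal. It is an illustration only, not Tajo's planner code: it assumes a `key=value` directory layout, uses the local filesystem in place of HDFS, handles equality predicates only, and the names `prune_by_traversal` and `build_sample_table` are hypothetical.

```python
import os
import tempfile


def _matches(dirname, predicates):
    """Return False only if dirname is 'key=value' and contradicts a predicate."""
    key, sep, value = dirname.partition("=")
    if not sep or key not in predicates:
        return True
    return predicates[key] == value


def prune_by_traversal(table_root, predicates):
    """Walk the table directory and collect leaf partition directories
    whose key=value components satisfy every equality predicate."""
    matched = []
    for dirpath, dirnames, _ in os.walk(table_root):
        is_leaf = not dirnames  # decide before pruning the child list
        # Prune in place so os.walk skips non-matching subtrees.
        dirnames[:] = [d for d in dirnames if _matches(d, predicates)]
        if is_leaf and dirpath != table_root:
            matched.append(os.path.relpath(dirpath, table_root))
    return sorted(matched)


def build_sample_table(root):
    """Create a small hypothetical table partitioned by year and month."""
    for year in ("2013", "2014"):
        for month in ("03", "04"):
            os.makedirs(os.path.join(root, f"year={year}", f"month={month}"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        build_sample_table(root)
        for path in prune_by_traversal(root, {"year": "2014"}):
            print(path)
```

Even with pruning, every query has to list each directory level it descends into, so the number of filesystem calls grows with the number of partitions.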
{code:title=Synopsis of ALTER TABLE ADD/DROP PARTITION}
ALTER TABLE table_name [IF NOT EXISTS] ADD COLUMN PARTITION (key1 = 'val1',
key2 = 'val2', ...) WITH ('prop_key' = 'prop_val', ...) LOCATION '...';
ALTER TABLE table_name [IF EXISTS] DROP COLUMN PARTITION (key1
[=|<|<=|>|>=|!=] 'val1', key2 ..., ...);
{code}
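As a rough illustration of what storing partition entries in the catalog database could look like, here is a minimal Python sketch backed by an in-memory SQLite database. Everything in it is an assumption made for illustration: the `partitions` table schema, the function names, and the canonical `key=value` spec string are hypothetical and are not Tajo's catalog schema or API. It covers only equality on the full partition spec, not the comparison operators listed in the DROP synopsis.

```python
import sqlite3


def open_catalog():
    """Create an in-memory stand-in for the catalog's backing database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE partitions ("
        " table_name TEXT NOT NULL,"
        " spec TEXT NOT NULL,"
        " location TEXT NOT NULL,"
        " PRIMARY KEY (table_name, spec))"
    )
    return conn


def _canonical(spec):
    """Render {'year': '2014', 'month': '04'} as 'month=04/year=2014'."""
    return "/".join(f"{k}={v}" for k, v in sorted(spec.items()))


def add_partition(conn, table, spec, location, if_not_exists=False):
    """Register one partition entry; mirrors ADD ... [IF NOT EXISTS]."""
    try:
        conn.execute(
            "INSERT INTO partitions VALUES (?, ?, ?)",
            (table, _canonical(spec), location),
        )
    except sqlite3.IntegrityError:
        if not if_not_exists:
            raise ValueError(f"partition already exists: {_canonical(spec)}")


def drop_partition(conn, table, spec, if_exists=False):
    """Remove one partition entry; mirrors DROP ... [IF EXISTS]."""
    cur = conn.execute(
        "DELETE FROM partitions WHERE table_name = ? AND spec = ?",
        (table, _canonical(spec)),
    )
    if cur.rowcount == 0 and not if_exists:
        raise ValueError(f"no such partition: {_canonical(spec)}")


def list_partitions(conn, table):
    """Return (spec, location) pairs for a table without touching the filesystem."""
    rows = conn.execute(
        "SELECT spec, location FROM partitions WHERE table_name = ? ORDER BY spec",
        (table,),
    )
    return rows.fetchall()
```

With entries kept this way, a planner could resolve partitions with an indexed lookup instead of listing directories, and adding or dropping a partition becomes an explicit, auditable catalog operation.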
was:
Currently, Tajo does not manage partitions directly. In Tajo, each partition
is just a directory. For each query, the logical planner traverses the matching
directories in HDFS according to the partition predicates.
This approach is inefficient, especially in environments where the number
of partitions is very large. It also makes partition management hard.
Tajo should manage partitions directly through ALTER TABLE ADD/DROP PARTITION
statements. The partition entries should be stored in the underlying
database that the catalog uses.
{code:title=Synopsis of ALTER TABLE ADD/DROP PARTITION}
ALTER TABLE table_name [IF NOT EXISTS] ADD COLUMN PARTITION (key1 = 'val1',
key2 = 'val2', ...) WITH ('prop_key' = 'prop_val', ...) LOCATION '...';
ALTER TABLE table_name [IF EXISTS] DROP COLUMN PARTITION (key1
[=|<|<=|>|>=|!=] 'val1');
{code}
> (Umbrella) ALTER TABLE ADD/DROP PARTITION statement
> ---------------------------------------------------
>
> Key: TAJO-744
> URL: https://issues.apache.org/jira/browse/TAJO-744
> Project: Tajo
> Issue Type: New Feature
> Components: catalog
> Reporter: Hyunsik Choi
> Assignee: Jaehwa Jung
> Attachments: TAJO-744.Henrick-140423.01.patch.txt
>
>
> Currently, Tajo does not manage partitions directly. In Tajo, each partition
> is just a directory. For each query, the logical planner traverses the matching
> directories in HDFS according to the partition predicates.
> This approach is inefficient, especially in environments where the number
> of partitions is very large. It also makes partition management hard.
> Tajo should manage partitions directly through ALTER TABLE ADD/DROP
> PARTITION statements. The partition entries should be stored in the
> underlying database that the catalog uses.
> {code:title=Synopsis of ALTER TABLE ADD/DROP PARTITION}
> ALTER TABLE table_name [IF NOT EXISTS] ADD COLUMN PARTITION (key1 = 'val1',
> key2 = 'val2', ...) WITH ('prop_key' = 'prop_val', ...) LOCATION '...';
> ALTER TABLE table_name [IF EXISTS] DROP COLUMN PARTITION (key1
> [=|<|<=|>|>=|!=] 'val1', key2 ..., ...);
> {code}