[
https://issues.apache.org/jira/browse/TAJO-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375268#comment-14375268
]
ASF GitHub Bot commented on TAJO-1403:
--------------------------------------
Github user jihoonson commented on the pull request:
https://github.com/apache/tajo/pull/434#issuecomment-84746926
I think that it is possible to prune unnecessary partitions for the last
two cases without hurting our philosophy as follows.
When visiting a partition directory,
* if the current directory is a leaf, read all files.
* otherwise,
  * if some conditions are given for the current partition, visit only
    the qualifying sub-directories.
  * otherwise, visit all sub-directories.
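The traversal above could be sketched roughly as follows. This is only an illustration in Python, not Tajo code: the function name `collect_files`, the `conditions` dictionary (partition column mapped to its required value), and the `column=value` directory naming are all assumptions made for the example, and only equality conditions are handled.

```python
import os
import tempfile


def collect_files(directory, conditions):
    """Return the data files under `directory`, skipping pruned partitions.

    `conditions` is a hypothetical representation of the WHERE clause:
    a dict mapping a partition column name to the value it must equal.
    """
    entries = sorted(os.listdir(directory))
    subdirs = [e for e in entries if os.path.isdir(os.path.join(directory, e))]

    # Leaf directory: read all files.
    if not subdirs:
        return [os.path.join(directory, e) for e in entries]

    selected = []
    for name in subdirs:
        # Assumed layout: each sub-directory is named "column=value".
        column, _, value = name.partition("=")
        # If a condition is given for this partition level, visit only the
        # qualifying sub-directories; otherwise visit all of them.
        if column in conditions and conditions[column] != value:
            continue
        selected.extend(collect_files(os.path.join(directory, name), conditions))
    return selected
```

With a layout such as `/data/dt=2015-03-15/hr=00/...`, calling `collect_files("/data", {"hr": "00"})` would descend into every `dt=` directory, because no condition is given at that level, and then into only the `hr=00` directories below them.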
Anyway, here is my +1. I'll commit it.
However, I missed one comment earlier. EvalTreeUtil is a more appropriate
class than PlannerUtil for the checkIfPartitionSelection() and
getPartitionValue() functions, because every function related to EvalNode is
located in that class.
If you don't mind, I'll move those functions to EvalTreeUtil before committing.
> Improve 'Simple Query' with only partition columns and constant values
> ----------------------------------------------------------------------
>
> Key: TAJO-1403
> URL: https://issues.apache.org/jira/browse/TAJO-1403
> Project: Tajo
> Issue Type: Improvement
> Reporter: Dongjoon Hyun
> Assignee: Dongjoon Hyun
> Fix For: 0.11.0
>
> Attachments: TAJO-1403.patch
>
>
> Tajo shows a very fast response for a simple query (
> https://cwiki.apache.org/confluence/display/TAJO/Simple+Query+and+Forwarded+Query)
> like the following.
> {code:sql}
> select * from t1 limit 10;
> {code}
> However, in many cases, tables have partitions.
> {code:sql}
> create external table t1(id int) using csv with ('csvfile.delimiter'='|')
> partition by column(dt text) location '/data';
> select * from t1 where dt='2015-03-15' limit 10;
> {code}
> If every predicate in the WHERE clause is an 'EQUAL' comparison between a
> partition column and a constant value, I think Tajo can handle these cases
> very fast.
> This kind of query is very popular among DevOps users and in simple ETL apps.