[ 
https://issues.apache.org/jira/browse/TAJO-838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166171#comment-14166171
 ] 

Jihoon Son commented on TAJO-838:
---------------------------------

Hi [~mhthanh], thanks for your comment.

You are right. Index creation and maintenance are also very important. The 
index creation time was about 125 seconds. 

Regarding index maintenance, I think there are two cases. The first case 
is when an index is created for a partitioned table. Since new data are 
appended to the partitioned table, we definitely need to support automatic 
(implicit) index updates for it. This is a remaining issue that must be 
resolved. The second case is when a user deletes a table that has an index. In 
this case, we also need to remove the index automatically. 

Since HDFS is an append-only file system, I find it hard to imagine other 
cases. If I am missing something, please let me know.

Regarding the index structure, an index provides only pointers to the real 
data. Data are not packed inside the index. 
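
To illustrate the pointers-only structure, here is a minimal sketch in Python. 
It is not Tajo's actual index implementation; the class name `PointerIndex`, 
the `(key, offset)` entry layout, and the example offsets are all hypothetical 
and chosen only for illustration. The index keeps sorted keys together with 
the position of each row in the data file, and never stores the row itself.

```python
import bisect


class PointerIndex:
    """Hypothetical pointers-only index: maps keys to row offsets.

    The rows themselves stay in the data file; the index holds only
    (key, offset) pairs, sorted by key.
    """

    def __init__(self, entries):
        # entries: iterable of (key, offset_in_data_file) pairs
        self._entries = sorted(entries)
        self._keys = [key for key, _ in self._entries]

    def lookup(self, key):
        """Return the offsets of all rows whose indexed column equals key."""
        i = bisect.bisect_left(self._keys, key)
        offsets = []
        while i < len(self._keys) and self._keys[i] == key:
            offsets.append(self._entries[i][1])
            i += 1
        return offsets


# Example: two rows with key 5 live at (made-up) offsets 100 and 260.
index = PointerIndex([(5, 100), (3, 0), (5, 260)])
matching_offsets = index.lookup(5)
```

A reader of the table would then seek to each returned offset in the data 
file to fetch the actual row.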

Sincerely,
Jihoon

> Improve query planner to utilize index
> --------------------------------------
>
>                 Key: TAJO-838
>                 URL: https://issues.apache.org/jira/browse/TAJO-838
>             Project: Tajo
>          Issue Type: Sub-task
>          Components: planner/optimizer
>            Reporter: Jihoon Son
>            Assignee: Jihoon Son
>            Priority: Minor
>
> An index can improve query performance when the selectivity of the query is 
> high. Thus, the query planner should decide whether to use an index for a 
> given query.
> The selectivity can be estimated using statistics.
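
The decision described in the issue can be sketched roughly as follows. This 
is only an illustration under stated assumptions, not the planner logic in 
Tajo: the function names, the uniform-distribution estimate from column 
min/max statistics, and the 5% threshold are all hypothetical. Here 
"selectivity" is expressed as the estimated fraction of rows that match, so a 
highly selective query corresponds to a small fraction.

```python
def estimate_selectivity(lo, hi, col_min, col_max):
    """Estimate the fraction of rows with lo <= value <= hi.

    Assumes values are uniformly distributed between the column's
    min and max statistics (an assumption made for this sketch).
    """
    if col_max <= col_min:
        # Degenerate statistics: fall back to "every row may match".
        return 1.0
    overlap = min(hi, col_max) - max(lo, col_min)
    fraction = overlap / (col_max - col_min)
    return max(0.0, min(1.0, fraction))


def should_use_index(fraction, threshold=0.05):
    """Prefer the index only when few rows are expected to match.

    The 5% default threshold is arbitrary and for illustration only.
    """
    return fraction <= threshold


# A narrow range predicate on a column whose values span 0..1000.
narrow = estimate_selectivity(10, 20, 0, 1000)
use_index_for_narrow = should_use_index(narrow)

# A predicate covering the whole value range matches every row.
wide = estimate_selectivity(0, 1000, 0, 1000)
use_index_for_wide = should_use_index(wide)
```

Under these assumptions the narrow predicate would go through the index, 
while the wide one would fall back to a full scan.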



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
