[ 
https://issues.apache.org/jira/browse/TAJO-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13888442#comment-13888442
 ] 

Hyunsik Choi commented on TAJO-574:
-----------------------------------

Created a review request against branch master in reviewboard 
https://reviews.apache.org/r/17633/


> Add a sort-based physical executor for column partition store
> -------------------------------------------------------------
>
>                 Key: TAJO-574
>                 URL: https://issues.apache.org/jira/browse/TAJO-574
>             Project: Tajo
>          Issue Type: New Feature
>          Components: physical operator
>            Reporter: Hyunsik Choi
>            Assignee: Hyunsik Choi
>             Fix For: 0.8-incubating
>
>         Attachments: TAJO-574.patch
>
>
> ColumnPartitionStoreExec keeps numerous files open while it stores all the 
> data. In addition, its random writes put a burden on the HDFS namenode.
> To solve this problem, I would like to propose a sort-based physical executor 
> for column partition store. It assumes that input tuples are sorted in 
> ascending or descending order of the partition keys, which means it needs an 
> extra sort operation. However, it keeps only one file open at a time and 
> writes all data sequentially. In many cases, it would be the best choice for 
> column partition store.
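
The idea in the description can be illustrated with a minimal sketch. This is not the code in TAJO-574.patch, and it is written in Python rather than Tajo's Java for brevity. The function name `store_sorted_partitions`, the `key=<value>` directory layout, and the `part-0` file name are all hypothetical. It assumes the rows arrive already sorted by the partition key, so each key forms one contiguous run and only one output file is ever open.

```python
import os
import tempfile
from itertools import groupby
from operator import itemgetter


def store_sorted_partitions(rows, key_index, out_dir):
    """Write rows to one file per partition key, one open file at a time.

    Assumes `rows` is sorted (ascending or descending) by the partition key.
    Raises ValueError if a key reappears after its run ended (unsorted input).
    """
    written = []
    seen = set()
    # groupby yields one group per contiguous run of equal keys.
    for key, group in groupby(rows, key=itemgetter(key_index)):
        if key in seen:
            raise ValueError(f"input is not sorted by partition key: {key!r}")
        seen.add(key)
        part_dir = os.path.join(out_dir, f"key={key}")  # hypothetical layout
        os.makedirs(part_dir, exist_ok=True)
        path = os.path.join(part_dir, "part-0")  # hypothetical file name
        # The file for the previous key is already closed at this point.
        with open(path, "w") as f:
            for row in group:
                f.write(",".join(map(str, row)) + "\n")
        written.append(path)
    return written


if __name__ == "__main__":
    rows = [("kr", 1), ("us", 2), ("kr", 3)]
    rows.sort(key=itemgetter(0))  # the extra sort step the proposal mentions
    with tempfile.TemporaryDirectory() as d:
        for p in store_sorted_partitions(rows, 0, d):
            print(os.path.relpath(p, d))
```

The trade-off is the one stated above: the sort costs extra work up front, but each partition file is written sequentially and closed before the next one is opened.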



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
